# Multi-objective optimisation demonstration

This notebook demonstrates how to use the BioPharma Python software to *optimise* models of biopharmaceutical facilities against *multiple objectives*.

To run everything, select 'Run All' from the Cell menu. If you have made changes to the model equations in the biopharma package, select 'Restart & Run All' from the Kernel menu to ensure your changes are loaded.

First we load the biopharma software and set up the facility we want to optimise. See the [introductory demo notebook](User_demo.ipynb) for more explanation about this section. As in that example, we load default parameters from the [data](./data) folder by specifying it as the `data_path` argument to `Facility`.

In [None]:
import biopharma as bp

facility = bp.Facility(data_path='data')

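# Define the steps needed to create our single product.
# NOTE: the import list and the steps list are left blank here; fill in both
# before running, since an empty parenthesised import is a syntax error.
from biopharma.process_steps import (
    
)
steps = [
    
]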
step_names = [step.name for step in steps]
product = bp.Product(facility, steps)

## Defining what to optimise

The next step is to create the Optimiser, and tell it both what facility parameters we want to vary (and how), and what objective(s) to optimise.

Note that the order in which variables are added matters: choices made for earlier variables can constrain the options available for later ones, but not vice versa. Whether this changes the outcome of the optimisation depends on the model.
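To illustrate the ordering point, here is a minimal standalone sketch. It does not use the BioPharma API: `sample_in_order` and the rule linking the two variables are hypothetical, invented only to show how an earlier choice can narrow a later one.

```python
import random

def sample_in_order(generators, rng):
    """Draw one value per variable, passing earlier choices to later generators."""
    chosen = {}
    for name, generate in generators:
        chosen[name] = generate(chosen, rng)
    return chosen

# Hypothetical rule: the integer range is narrower when the binary choice is False.
generators = [
    ('binary_param', lambda chosen, rng: rng.choice([False, True])),
    ('int_param', lambda chosen, rng: rng.randint(0, 10 if chosen['binary_param'] else 5)),
]

rng = random.Random(0)
samples = [sample_in_order(generators, rng) for _ in range(100)]
```

Swapping the two entries in `generators` would fail in this sketch, because `int_param` would be drawn before `binary_param` exists. That is the sense in which later variables cannot constrain earlier ones.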

Key parameters for the optimisation routine are loaded from file in the same way as default facility parameters. Unless a different filename is given when the Optimiser is created, the [Optimiser.yaml](data/Optimiser.yaml) file is used.

Note that in contrast to the single-objective optimisation demo, here we define two objectives: minimising `cogs` and maximising `other_output`.

In [None]:
from biopharma import optimisation as opt

optimiser = opt.Optimiser(facility)

# Specify the variables to optimise.
optimiser.add_variable(gen=opt.gen.Binary(),
                       component=opt.sel.step('test_step'), item='binary_param',
                       track=opt.Tracking.discrete)
optimiser.add_variable(gen=opt.gen.RangeGenerator(0, 10),
                       component=opt.sel.step('test_step'), item='int_param',
                       track=opt.Tracking.numerical)

# Specify the objective(s)
optimiser.add_objective(component=opt.sel.product(0), item='cogs', minimise=True)
optimiser.add_objective(component=opt.sel.product(0), item='other_output', maximise=True)

Now we can run the optimisation. Outputs will be stored in the optimiser's `outputs` dictionary.

The commented-out lines (starting with a '# ') show how to override parameters defined in the [Optimiser.yaml](data/Optimiser.yaml) file.

In [None]:
# optimiser.parameters['populationSize'] = 10
# optimiser.parameters['maxGenerations'] = 10
optimiser.run()

Note that the 'best' individuals from a multi-objective optimisation are simply those with the best fitness in each objective considered separately. They thus give only a limited view of the fitness space, and can miss good trade-offs between objectives. More information can be gained from plotting the full population, shown further below.
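A small standalone sketch of why the per-objective 'best' can miss trade-offs. The fitness pairs below are made up for illustration and are not model output. As in this notebook, the first objective is minimised and the second maximised.

```python
# Hypothetical (cogs, other_output) fitness pairs.
population = [(10.0, 1.0), (12.0, 3.0), (15.0, 6.0), (14.0, 2.0)]

# Best in each objective taken separately.
best_cost = min(population, key=lambda fit: fit[0])
best_output = max(population, key=lambda fit: fit[1])
per_objective_best = {best_cost, best_output}

def dominates(a, b):
    """True if a is no worse than b in both objectives and better in at least one."""
    no_worse = a[0] <= b[0] and a[1] >= b[1]
    better = a[0] < b[0] or a[1] > b[1]
    return no_worse and better

# Individuals that no other individual dominates.
pareto_front = [p for p in population
                if not any(dominates(q, p) for q in population)]
```

Here `(12.0, 3.0)` is on the Pareto front but is best in neither objective, so it would not appear among the per-objective 'best' individuals.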

In [None]:
print('Best individual(s) with fitness(es):')
for ind in optimiser.outputs['bestIndividuals']:
    print('{}: {}'.format(ind.fitness.values, ind))
print('Fitnesses ({}) of final population:'.format(
    ', '.join(map(str, [obj['item'] for obj in optimiser.objectives]))))
for ind in optimiser.outputs['finalPopulation']:
    print('   ', ind.fitness.values)

## Reporting on the results

In [None]:
# Importing the libraries we need
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib notebook

If multi-objective optimisation has been performed, we can plot where in fitness space the final population lies, and in particular the separate Pareto fronts showing which individuals are dominated by others. The rank 1 Pareto front contains the individuals that no other individual dominates, which is the usual interpretation of 'optimal' in this setting.

First we determine which Pareto front each individual falls within (TODO: do this within the code instead).

In [None]:
from deap.tools import sortNondominated
# Sort the whole final population, so that every individual is assigned a rank
final_pop = optimiser.outputs['finalPopulation']
all_fronts = sortNondominated(final_pop, len(final_pop))
print('Pareto front sizes:', list(map(len, all_fronts)))
for rank, members in enumerate(all_fronts):
    for ind in members:
        ind.pareto_rank = rank + 1
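The ranking step above can be sketched in plain Python, without DEAP, to show what the fronts mean. This is a simple illustration, not DEAP's implementation: `pareto_ranks` is a hypothetical helper and the fitness values are made up. It follows DEAP's convention of multiplying each value by a weight (negative for minimised objectives), so that larger weighted values are always better.

```python
def dominates(a, b):
    """a dominates b when every weighted value is >= and at least one is >."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

def pareto_ranks(fitnesses, weights):
    """Return a 1-based Pareto rank for each fitness tuple."""
    weighted = [tuple(v * w for v, w in zip(fit, weights)) for fit in fitnesses]
    ranks = [None] * len(fitnesses)
    remaining = set(range(len(fitnesses)))
    rank = 1
    while remaining:
        # The current front: everything not dominated by another remaining point.
        front = {i for i in remaining
                 if not any(dominates(weighted[j], weighted[i]) for j in remaining)}
        for i in front:
            ranks[i] = rank
        remaining -= front
        rank += 1
    return ranks

# Hypothetical (cogs, other_output) pairs: minimise the first, maximise the second.
fitnesses = [(10.0, 1.0), (12.0, 3.0), (14.0, 2.0), (16.0, 1.5)]
ranks = pareto_ranks(fitnesses, weights=(-1.0, 1.0))
```

Each pass peels off the current non-dominated set, so rank 2 is the front that remains once rank 1 is removed, and so on.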

Now we can plot (at least with 2 objectives) all the Pareto fronts, showing where they sit within fitness space. The size of the circles indicates how many individuals have that fitness, and the colour indicates which Pareto front they are within (with rank 1 being the best).

In [None]:
if len(optimiser.objectives) == 2:
    from collections import defaultdict

    fig, ax = plt.subplots()
    pop = optimiser.outputs['finalPopulation']

    # Group individuals that share a fitness, so each distinct point is drawn once
    map_fit_ind = defaultdict(list)
    for ind in pop:
        map_fit_ind[ind.fitness].append(ind)
    fitnesses = list(map_fit_ind.keys())

    # Draw one scatter series per Pareto rank
    for rank in sorted({ind.pareto_rank for ind in pop}):
        fits = [fit for fit in fitnesses if map_fit_ind[fit][0].pareto_rank == rank]
        front = [map_fit_ind[fit][0] for fit in fits]
        # Marker area scales with the number of individuals at that fitness
        sizes = [2 * len(map_fit_ind[fit]) for fit in fits]
        xs = [ind.fitness.values[0] for ind in front]
        ys = [ind.fitness.values[1] for ind in front]
        ax.scatter(xs, ys, s=sizes, alpha=0.5, label='Rank {}'.format(rank))
    ax.set_title('Pareto Fronts')
    ax.set_xlabel(optimiser.objectives[0]['item'])
    ax.set_ylabel(optimiser.objectives[1]['item'])
    ax.legend(loc='best')
    ax.grid(True)
    plt.show()
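The marker sizes in the plot come from counting how many individuals share each fitness. A minimal standalone sketch of that counting, using made-up fitness tuples rather than optimiser output, and an assumed scale factor `base_size` (the cell above uses 2):

```python
from collections import Counter

# Hypothetical fitness tuples; several individuals share the same point.
fitnesses = [(10.0, 1.0), (10.0, 1.0), (12.0, 3.0),
             (10.0, 1.0), (12.0, 3.0), (15.0, 6.0)]

counts = Counter(fitnesses)   # fitness tuple -> number of individuals
points = sorted(counts)       # one entry per distinct fitness
base_size = 20                # assumed scale factor, in points squared
sizes = [base_size * counts[p] for p in points]
```

Matplotlib's `scatter` interprets `s` as the marker area in points squared, so a larger scale factor may make the difference between crowded and sparse points easier to see.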

We can also pick a single 'best' individual from the set and show its properties, as before.

In [None]:
best = optimiser.outputs['bestIndividuals'][0]
labels = {'param1': {False: 'P1 not set', True: 'P1 set'},
          'param2': {False: 'Here be dragons', True: 'All normal'}}
column_info = pd.DataFrame(
    {name: [labels[item][best.get_variable(name, item).value]
            for item in ['param1', 'param2']]
     for name in step_names},
    index=['Param 1', 'Param 2'])
column_info

Next we plot some non-categorical parameters of the same individual, one point per step.

In [None]:
best = optimiser.outputs['bestIndividuals'][0]

fig, ax = plt.subplots()
for name in step_names:
    x = best.get_variable(name, 'p1').value
    y = best.get_variable(name, 'p2').value
    ax.scatter(x.magnitude, y.magnitude, s=40, label=name)
ax.set_title('Two parameters')
# x and y still hold the values for the last step plotted; they supply the axis units
ax.set_xlabel('P1 ({})'.format(x.units))
ax.set_ylabel('P2 ({})'.format(y.units))
ax.legend()
ax.grid(True)
plt.show()

Finally we show the cost breakdown into different categories for the 'best' solution.

In [None]:
best = optimiser.outputs['bestIndividuals'][0]

# At present we need to re-run the model with the best individual to get at specific outputs
best.apply_to_facility()
facility.run()

# Cost breakdown data, per gram of product (dividing by y_units leaves plain numbers to plot)
y_units = bp.units.GBP / bp.units.g
grams_produced = product.outputs['total_output'].to('g')
labour_costs = product.outputs['labourCost'] / grams_produced / y_units
consumables_costs = product.outputs['consumablesCost'] / grams_produced / y_units

# Create plot
fig, ax = plt.subplots()

p_labour = ax.bar(0, labour_costs)
p_consumables = ax.bar(0, consumables_costs, bottom=labour_costs)

ax.set_title('Cost Breakdown')
ax.set_ylabel('Cost of goods ({})'.format(y_units))
ax.set_xticks([])
ax.set_xbound(-2, 2)
ax.legend((p_labour[0], p_consumables[0]),
          ('Labour', 'Consumables'))
plt.subplots_adjust(bottom=0.25)
plt.show()