# Process Optimization

In this session we will take the calibrated process we created before and try to find the optimal process parameters.

First, we should define how we _measure_ optimality.

Then, we will look into how we _find_ parameters that optimize these measurements.

## Section 1: Fractionation

In our example, our colleagues asked us to optimize the process to get the highest recovery of component "B" with a purity of above 95%.

Let's take our process model with a random gradient configuration and evaluate the recovery and purity of our components.


To start, import the previously configured LRM-SMA process.

In [None]:
from CADETProcess.processModel import ComponentSystem
from CADETProcess.processModel import StericMassAction
from CADETProcess.processModel import Inlet, GeneralRateModel, Outlet, LumpedRateModelWithoutPores, TubularReactor
from CADETProcess.processModel import FlowSheet
from CADETProcess.processModel import Process
from CADETProcess.simulator import Cadet

In [None]:
component_system = ComponentSystem(['Salt', 'A', "B", "C"])

# Binding Model
binding_model = StericMassAction(component_system)
binding_model.is_kinetic = True
binding_model.adsorption_rate = [0, 1.3e-5, 5.59e-1, 9.5e-3]
binding_model.desorption_rate = [0, 1, 1, 1]
binding_model.characteristic_charge = [0, 6.9, 2.3, 5.8]
binding_model.steric_factor = [0, 10, 10.6, 11.83]
binding_model.capacity = 1.2e3

column = LumpedRateModelWithoutPores(component_system, name='column')
column.length = 0.014
column.total_porosity = 0.5
column.diameter = 0.01
column.axial_dispersion = 5.75e-7

pipe1 = TubularReactor(component_system, name="pipe1")
pipe1.length = 0.1
pipe1.diameter = 0.001
pipe1.axial_dispersion = 6e-6
pipe1.discretization.ncol = 50

pipe2 = TubularReactor(component_system, name="pipe2")
pipe2.length = 0.02
pipe2.diameter = 0.001
pipe2.axial_dispersion = 6e-6
pipe2.discretization.ncol = 50


column.binding_model = binding_model

column.q = [50, 0, 0, 0]
column.c = [50, 0, 0, 0]
pipe1.c = [50, 0, 0, 0]
pipe2.c = [50, 0, 0, 0]
column.volume

In [None]:
volumetric_flow_rate = 1.67e-8

inlet = Inlet(component_system, name='inlet')
inlet.flow_rate = volumetric_flow_rate

outlet = Outlet(component_system, name='outlet')

# Flow Sheet
flow_sheet = FlowSheet(component_system)

flow_sheet.add_unit(inlet, feed_inlet=True)
flow_sheet.add_unit(pipe1)
flow_sheet.add_unit(column)
flow_sheet.add_unit(pipe2)
flow_sheet.add_unit(outlet, product_outlet=True)

flow_sheet.add_connection(inlet, pipe1)
flow_sheet.add_connection(pipe1, column)
flow_sheet.add_connection(column, pipe2)
flow_sheet.add_connection(pipe2, outlet)

In [None]:
# Process
process = Process(flow_sheet, 'batch elution')

process.cycle_time = 6000

c_salt_load = 50
c_salt_gradient1_start = 86.89371792
c_salt_gradient1_end = 500
duration_gradient1 = 3127.53243
t_gradient1_start = 90
t_start_wash = 10
gradient_1_slope = (c_salt_gradient1_end - c_salt_gradient1_start)/(process.cycle_time - t_gradient1_start)

c_load = [c_salt_load, 1, 1, 1]

c_wash = [c_salt_load, 0, 0, 0]

c_gradient1_poly = [
    [c_salt_gradient1_start, gradient_1_slope, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0]
]

process.add_duration("grad1_duration", duration_gradient1)

process.add_event('load', 'flow_sheet.inlet.c', c_load, 0)
process.add_event('wash', 'flow_sheet.inlet.c', c_wash, t_start_wash)
process.add_event('grad1_start', 'flow_sheet.inlet.c', c_gradient1_poly, t_gradient1_start)

In [None]:
import matplotlib.pyplot as plt
from CADETProcess.plotting import SecondaryAxis

simulator = Cadet()
simulator.time_resolution = 5

# process.plot_events()

simulation_results = simulator.simulate(process)
print(simulation_results.time_elapsed)

# Plot the salt concentration on a secondary axis.
sec = SecondaryAxis()
sec.components = ['Salt']
sec.y_label = '$c_{salt}$'

simulation_results.solution.outlet.outlet.plot(secondary_axis=sec)

plt.tight_layout()

The requirements for the "optimal" process we got from our colleagues were:
- Collect a fraction for component B
- Purity above 95% for component B
- Highest possible recovery

What exactly do these metrics mean?

## Key Performance Indicators (KPI)

### Purity

$$
PU_{i} = \frac{m_{i}^{i}}{\sum_{l=1}^{n_{comp}} m_{l}^{i}},\\
$$
where $n_{comp}$ is the number of mixture components and $m_{l}^{i}$ is the mass of component $l$ in target fraction $i$.

### Recovery Yield
$$
Y_{i} = \frac{m_i}{m_{feed, i}},\\
$$
where $m_i$ is the mass of component $i$ collected in its target fraction and $m_{feed, i}$ is the injected amount of component $i$.

### Productivity
$$
PR_{i} = \frac{m_i}{V_{solid} \cdot \Delta t_{cycle}},\\
$$
with $V_{solid}$: volume of stationary phase.

### Eluent Consumption
$$
EC_{i} = \frac{V_{solvent}}{m_i},\\
$$
with $V_{solvent}$: solvent used during a cycle.
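
To make the definitions concrete, here is a minimal hand calculation in plain Python. The function name `kpis` and all numbers are hypothetical and chosen only to keep the arithmetic easy to follow; this is not how CADET-Process computes the values internally.

```python
def kpis(m_fraction, target, m_feed_target, v_solid, t_cycle, v_solvent):
    """Evaluate the four KPI formulas above for one target component.

    m_fraction: masses of all components collected in the target fraction.
    target: index of the target component within m_fraction.
    """
    m_target = m_fraction[target]
    purity = m_target / sum(m_fraction)
    recovery = m_target / m_feed_target
    productivity = m_target / (v_solid * t_cycle)
    eluent_consumption = v_solvent / m_target
    return purity, recovery, productivity, eluent_consumption


# Hypothetical values (consistent but arbitrary units).
purity, recovery, productivity, eluent_consumption = kpis(
    m_fraction=[0.02, 0.76, 0.02],  # masses of A, B, C in the B fraction
    target=1,                       # B is the target component
    m_feed_target=1.0,              # injected mass of B
    v_solid=5e-7,                   # volume of stationary phase
    t_cycle=6000,                   # cycle time
    v_solvent=1e-4,                 # solvent used during one cycle
)

print(f"purity={purity:.3f}, recovery={recovery:.3f}")
print(f"productivity={productivity:.1f}, eluent consumption={eluent_consumption:.2e}")
```

With these numbers, the fraction just meets the 95% purity requirement while recovering 76% of the injected B.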

Luckily we do not need to calculate them by hand.

The `Fractionator` class allows slicing the solution and pooling fractions for the individual components.

It enables evaluating multiple chromatograms at once and multiple fractions per component per chromatogram.

To add a fractionation event, the following arguments need to be provided:
- `event_name`: Name of the event.
- `target`: Pool to which fraction is added. `-1` indicates waste.
- `time`: Time of the event
- `chromatogram`: Name of the chromatogram. Optional if only one outlet is set as `chromatogram_sink`.

Here, component $B$ seems to have sufficient purity between 30 minutes and 34 minutes and component $C$ between 45 and 60 minutes.

The `performance` object of the `Fractionator` contains the parameters:

The chromatogram can be plotted with the fraction times overlaid:

## Optimization of Fractionation Times
- The `fractionation` module provides tools to automatically determine optimal cut times.
- By default, the mass of the components is maximized under purity constraints.
- Different purity requirements can be specified for each component
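
The idea behind maximizing mass under a purity constraint can be sketched with a toy example. The snippet below is only an illustration in plain Python: it uses hypothetical Gaussian peaks and a brute-force search over all collection windows, whereas the `fractionation` module uses its own optimization routines. The helper names `gaussian` and `best_cut` are assumptions made for this sketch.

```python
import math


def gaussian(t, center, width):
    return math.exp(-0.5 * ((t - center) / width) ** 2)


times = list(range(0, 101))
# Hypothetical, partly overlapping peaks for components A, B and C.
signals = {
    "A": [gaussian(t, 35, 5) for t in times],
    "B": [gaussian(t, 50, 5) for t in times],
    "C": [gaussian(t, 65, 5) for t in times],
}


def best_cut(signals, target, purity_required):
    """Brute-force the window [start, end) that maximizes the target mass."""
    n = len(signals[target])
    best = (0.0, None, None)
    for start in range(n):
        m_target = 0.0
        m_total = 0.0
        for end in range(start + 1, n + 1):
            # Extend the window by one sample (unit time step, so mass ~ sum).
            m_target += signals[target][end - 1]
            m_total += sum(s[end - 1] for s in signals.values())
            purity = m_target / m_total
            if purity >= purity_required and m_target > best[0]:
                best = (m_target, start, end)
    return best


mass, start, end = best_cut(signals, "B", purity_required=0.95)
total_b = sum(signals["B"])
print(f"collect B from t={times[start]} to t={times[end - 1]}")
print(f"recovery of B: {mass / total_b:.2f}")
```

Widening the window increases the collected mass of B but also pulls in more of A and C, so the purity requirement determines where the cuts end up.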

To automatically optimize the fractionation times, pass the simulation results to the `optimize_fractionation` function.

The results are stored in a `Performance` object.

The chromatogram can also be plotted with the fraction times overlaid:

For comparison, these are the results if two components are relevant:

We can add a `FractionationOptimizer` to the optimization chain. This way, the optimizer will, during each optimization step:
1. run a simulation with some set of parameters
2. run a fractionation-optimization on the resulting chromatogram
3. calculate and report the target metric (yield, purity, time, etc.)
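
The three steps can be sketched as a chain of plain Python functions. Everything here is a toy stand-in: `simulate`, `fractionate`, and the single parameter `peak_distance` are hypothetical and only mimic the roles that the simulator, the `FractionationOptimizer`, and the optimization variables play in the real workflow.

```python
import math


def simulate(peak_distance):
    """Toy stand-in for a simulation: two Gaussian peaks on a unit time grid."""
    times = range(0, 121)
    b = [math.exp(-0.5 * ((t - 60) / 5) ** 2) for t in times]
    c = [math.exp(-0.5 * ((t - 60 - peak_distance) / 5) ** 2) for t in times]
    return b, c


def fractionate(b, c, purity_required=0.95):
    """Toy stand-in for fractionation: best recovery of B under a purity limit."""
    best = 0.0
    for start in range(len(b)):
        m_b = m_c = 0.0
        for end in range(start, len(b)):
            m_b += b[end]
            m_c += c[end]
            if m_b / (m_b + m_c) >= purity_required:
                best = max(best, m_b)
    return best / sum(b)


def objective(peak_distance):
    b, c = simulate(peak_distance)  # step 1: run a simulation
    return fractionate(b, c)        # steps 2 and 3: fractionate, report recovery


# The optimizer would propose these values; here we simply try a few.
candidates = [5, 10, 20]
recoveries = {d: objective(d) for d in candidates}
best = max(recoveries, key=recoveries.get)
print(recoveries, "best:", best)
```

The key point is that the fractionation step sits inside the objective, so every candidate parameter set is judged by its best achievable cut times.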

First, again, create an `OptimizationProblem`:

Then we can add a `FractionationOptimizer` as an evaluator, and add the `Recovery` as our objective.

Now we can identify the optimal solution if we find it. Next step: set up our optimization problem to search for it.

## Section 2: Constrained Optimization

Let us collect process parameters we could tune to optimize the separation process.

In [None]:
from CADETProcess.plotting import SecondaryAxis

sec = SecondaryAxis()
sec.components = ['Salt']
sec.y_label = '$c_{salt}$'

simulation_results.solution.outlet.outlet.plot(secondary_axis=sec)
import matplotlib.pyplot as plt

plt.tight_layout()

For today, let us focus on a two-gradient scenario. This gives the following parameters:

1. Gradient
   - starting concentration
   - starting time
   - slope
   - duration
2. Gradient
   - starting concentration
   - starting time
   - slope
   - duration

Some of these parameters depend on one another. Can you give an example?

First, we'll need to add the missing event.

Now let's add the `grad1_start` `time` variable and the `grad1_start` `concentration` variable:

In [None]:
# gradient1 slope
optimization_problem.add_variable(
    "gradient1_slope",
    lb=0.001, ub=10, indices=(0, 1),
    parameter_path='grad1_start.state'
)

optimization_problem.add_variable(
    'grad1_duration.time',
    lb=120,
    ub=5790
)

# gradient2 start time
var = optimization_problem.add_variable(
    "grad2_start.time",
    lb=500,
    ub=900,
)

# gradient2 start concentration
var = optimization_problem.add_variable(
    "gradient2_start_conc", lb=1, ub=1e5, indices=(0, 0),
    parameter_path='grad2_start.state'
)

# gradient2 slope
optimization_problem.add_variable(
    "gradient2_slope", lb=-100, ub=1e6, indices=(0, 1),
    parameter_path='grad2_start.state'
)

Now that we have all variables defined, let's have a look at the four types of constraints we can use:

```
|            | Linear | Nonlinear |
|------------|--------|-----------|
| Equality   |        |           |
| Inequality |        |           |
```

### Linear equality constraints

The start time of the second gradient is a good example of a linear equality constraint.

(Even though it could be done with event dependencies.)

We can formulate it as:

```
grad1_start.time + grad1_duration = grad2_start.time
```

or, restructured in a way that is more common in constraint formulations:

```
1 * grad1_start.time + 1 * grad1_duration - 1 * grad2_start.time = 0
```

This is a formulation according to the standard:

$$
A_{eq} \cdot x = b_{eq}
$$

with $A_{eq} = (1, 1, -1)$ and $b_{eq} = 0$, where $x$ collects `grad1_start.time`, `grad1_duration`, and `grad2_start.time` in that order.


In **CADET-Process**, each row $a_{eq}$ of the constraint matrix needs to be added individually.
The `add_linear_equality_constraint` function takes the variables subject to the constraint as first argument.
The left-hand side $a_{eq}$ and the bound $b_{eq, a}$ are passed as second and third argument.
It is important to note that the column order in $a_{eq}$ is inferred from the order in which the optimization variables are passed.



To add this constraint to the optimization problem, add the following:

To check if a point fulfils the linear equality constraints, use the `check_linear_equality_constraints` method.
It returns `True` if the point is within bounds and `False` otherwise.
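
What such a check amounts to can be shown with a few lines of plain Python. This is a sketch of the underlying arithmetic, not the CADET-Process implementation; the helper `satisfies_linear_equality`, its tolerance, and the example times are assumptions.

```python
def satisfies_linear_equality(a_eq, x, b_eq, tol=1e-8):
    """Check a single constraint row a_eq . x = b_eq within a tolerance."""
    residual = sum(a * xi for a, xi in zip(a_eq, x)) - b_eq
    return abs(residual) <= tol


# Variable order: grad1_start.time, grad1_duration, grad2_start.time
a_eq = [1, 1, -1]
b_eq = 0

# Gradient 2 starts exactly when gradient 1 ends (90 + 600 = 690).
print(satisfies_linear_equality(a_eq, [90, 600, 690], b_eq))  # True

# Gradient 2 starts 110 s after gradient 1 ends.
print(satisfies_linear_equality(a_eq, [90, 600, 800], b_eq))  # False
```

Note how the order of the entries in `a_eq` has to match the order of the variables in the point.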

### Linear inequality constraints

Linear inequality constraints work much like linear equality constraints.

$$
A \cdot x \leq b
$$

In **CADET-Process**, each row $a$ of the constraint matrix needs to be added individually.
The `add_linear_constraint` function takes the variables subject to the constraint as first argument.
The left-hand side $a$ and the bound $b_a$ are passed as second and third argument.
It is important to note that the column order in $a$ is inferred from the order in which the optimization variables are passed.

There are no inequality constraints in our example, but we could add a step with constant salt concentration between gradient 1 and gradient 2, which would turn the equality constraint from above into:

```
grad1_start.time + grad1_duration <= grad2_start.time
```
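
As a plain Python sketch of this inequality (the helper name and the example times are hypothetical, and this is not the library's own check), the row $a = (1, 1, -1)$ with $b = 0$ can be evaluated like this:

```python
def satisfies_linear_inequality(a, x, b):
    """Check a single constraint row a . x <= b."""
    return sum(ai * xi for ai, xi in zip(a, x)) <= b


# Variable order: grad1_start.time, grad1_duration, grad2_start.time
a = [1, 1, -1]
b = 0

no_hold = [90, 600, 690]    # gradient 2 follows gradient 1 immediately
with_hold = [90, 600, 800]  # 110 s of constant salt in between
overlap = [90, 600, 500]    # gradient 2 would start before gradient 1 ends

for x in (no_hold, with_hold, overlap):
    hold_time = x[2] - (x[0] + x[1])
    print(x, "feasible:", satisfies_linear_inequality(a, x, b), "hold:", hold_time)
```

Compared to the equality constraint, every point with a non-negative hold time is now feasible.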

### Nonlinear constraints

It is also possible to add nonlinear constraints to the `OptimizationProblem`.

The salt concentration of the first gradient is a good example of a nonlinear inequality constraint.

Our lab colleagues have told us that the salt concentration after the first gradient should not exceed 1000 mM.

We can formulate this requirement as:

```
gradient1_start_conc + gradient1_slope * grad1_duration <= 1000
```

Subtracting 1000 from both sides brings this into the standard form:


$$
g(x) \le 0
$$

Nonlinear constraints need to be added as callable functions.

Note that multiple nonlinear constraints can be added.

In addition to the function, lower or upper bounds can be added.
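
A callable for the salt limit could look like the following sketch. It is written in plain Python and assumes that the point `x` holds the start concentration, slope, and duration of gradient 1 in that order; the function names and example values are hypothetical, and registering the callable with the `OptimizationProblem` is not shown here.

```python
def gradient1_end_concentration(x):
    """Salt concentration at the end of gradient 1."""
    start_conc, slope, duration = x
    return start_conc + slope * duration


UPPER_BOUND = 1000.0  # mM, the limit from the lab


def g(x):
    """Constraint in standard form: feasible if g(x) <= 0."""
    return gradient1_end_concentration(x) - UPPER_BOUND


x_feasible = [50.0, 0.1, 3000.0]    # ends near 350 mM
x_infeasible = [50.0, 0.5, 3000.0]  # ends near 1550 mM

print(g(x_feasible) <= 0)    # True
print(g(x_infeasible) <= 0)  # False
```

The constraint is nonlinear because two optimization variables, slope and duration, are multiplied with each other.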

In [None]:
optimization_problem.variable_names

Again, the function can be evaluated manually.

### Nonlinear equality constraints (variable dependencies)

Lastly, we can add variable dependencies that rely on non-linear combinations of other variables. These also need to be added as a callable.

We could make the start concentration of the second gradient depend on the parameters of the first gradient, as:

```
gradient2_start_conc = gradient1_start_conc + gradient1_slope * grad1_duration
```
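
The relation above can be sketched as a plain Python transform that computes the dependent value from the independent ones. The dictionary of values and the helper name `transform` are hypothetical; the point is only that the dependent variable is no longer chosen by the optimizer but derived from the others.

```python
def transform(start_conc, slope, duration):
    """Start concentration of gradient 2 equals the end of gradient 1."""
    return start_conc + slope * duration


# Hypothetical values proposed by the optimizer for the independent variables.
independent = {
    "gradient1_start_conc": 50.0,
    "gradient1_slope": 0.25,
    "grad1_duration": 2000.0,
}

# The order of the names must match the order of the transform's arguments.
names = ["gradient1_start_conc", "gradient1_slope", "grad1_duration"]
gradient2_start_conc = transform(*(independent[name] for name in names))

print(gradient2_start_conc)  # 550.0
```

Because the dependent variable is computed rather than searched, each dependency reduces the dimension of the search space by one.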

In [None]:
optimization_problem.variable_names

In [None]:
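# Choose the slope of gradient 2 so that the salt concentration reaches
# 1000 mM at the end of the cycle. The value 90 is the start time of gradient 1.
optimization_problem.add_variable_dependency(
    dependent_variable="gradient2_slope",
    independent_variables=["grad1_start_conc", "gradient1_slope", "grad1_duration.time"],
    transform=lambda start_conc, slope, duration:
    (1000 - (start_conc + slope * duration)) / (process.cycle_time - 90 - duration)
)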

In [None]:
# Gradient 2 starts as soon as gradient 1 ends.
optimization_problem.add_variable_dependency(
    "grad2_start.time",
    ["grad1_start_time", "grad1_duration.time"],
    transform=lambda starttime, duration: starttime + duration
)

In [None]:
def callback(fractionation, individual, evaluation_object, callbacks_dir):
    fractionation.plot_fraction_signal(
        file_name=f'{callbacks_dir}/{individual.id[:10]}_{evaluation_object}_fractionation.png',
        show=False
    )

optimization_problem.add_callback(
    callback, requires=[simulator, fractionation_optimizer]
)

def callback_sim(simulation_results, individual, callbacks_dir='./'):
    sec = SecondaryAxis()
    sec.components = ['Salt']
    sec.y_label = '$c_{salt}$'

    simulation_results.solution.outlet.outlet.plot(
        secondary_axis=sec,
        show=False,
        file_name=f'{callbacks_dir}/{individual.id[:10]}.png'
    )

optimization_problem.add_callback(
    callback_sim, requires=[simulator, ]
)

In [None]:
from CADETProcess.optimization import U_NSGA3

optimizer = U_NSGA3()
optimizer.n_cores = 6
optimizer.pop_size = 60
optimizer.n_max_gen = 4

In [None]:
optimization_results = optimizer.optimize(
    optimization_problem
)