In [1]:
# Resolve path when used in a usecase project
import sys
from pathlib import Path

sys.path.insert(0, str(Path("../../").resolve()))

In [2]:
import logging
import sys

logging.basicConfig(level=logging.INFO, stream=sys.stdout)

In [3]:
import recommend
print(f'Using {recommend.__version__} version of recommend package')

INFO:numexpr.utils:NumExpr defaulting to 8 threads.
Using 0.33.0 version of recommend package


# `recommend` package tutorial

## Prerequisites

This package assumes familiarity with the main concepts of the `oai.optimizer` package ([refer to these docs](https://brix.quantumblack.com/products/optimus/docs/src/packages/optimizer/src/optimizer/README.html) to learn more). A short summary of those concepts follows.

**TL;DR**

* It implements the ask and tell optimization paradigm.
* It introduces the `OptimizationProblem`, `Solver`, and `Stopper` abstractions. The problem encapsulates business-specific details. The solver only cares about which values to propose to improve the solution. The stopper defines additional criteria for solver termination.
* Optimizer doesn't make any assumptions about the optimized objective nor constraints. It provides heuristic solvers which don't depend on the structure of the objective which lets us use models in objective and/or constraints.
* Optimizer refers to the objective as the main metric we're optimizing. It can be either a formula or an ML model.
* Constraints can be either expressed by **repairs** or **penalties**:
    * **penalties** – soft constraints; if a solution violates them, a penalty is added to the objective with the correct sign (depending on the problem's sense). A penalty is expressed by the `optimizer.Penalty` class, which can be created directly or via the `optimizer.penalty` function.
    * **repairs** – an alternative way of handling constraints; the idea is to repair the parameters proposed by the solver (when they violate constraints) instead of penalising the objective. A repair is expressed by the `Repair` class, which can be created directly or via the `repair` function.
* The last and simplest entity is `StopperBase`, which has several implementations in `optimizer.stoppers`. It helps formulate additional conditions for stopping the optimization routine (e.g., the solver didn't produce improvements over the last n iterations – `NoImprovementStopper`; all constraints are met – `SatisfiedConstraintsStopper`; etc.).
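
To make the ask-and-tell paradigm concrete, here is a minimal sketch of the loop in plain Python. `RandomSearchSolver`, its `ask`/`tell` methods, and the toy objective are hypothetical stand-ins written for this illustration; they are not the `optimizer` API.

```python
import random


class RandomSearchSolver:
    """Hypothetical toy solver: proposes random points and remembers the best one."""

    def __init__(self, low, high, population=20, seed=0):
        self._rng = random.Random(seed)
        self._low, self._high, self._population = low, high, population
        self.best_x, self.best_value = None, float("inf")

    def ask(self):
        # Propose a batch of candidate solutions.
        return [
            self._rng.uniform(self._low, self._high)
            for _ in range(self._population)
        ]

    def tell(self, candidates, values):
        # Receive objective values and keep the best candidate (minimization).
        for x, value in zip(candidates, values):
            if value < self.best_value:
                self.best_x, self.best_value = x, value


def toy_objective(x):
    # Toy objective with its minimum at x = 3.
    return (x - 3.0) ** 2


solver = RandomSearchSolver(low=0.0, high=10.0)
for _ in range(50):  # a stopper would normally decide when to end this loop
    candidates = solver.ask()
    solver.tell(candidates, [toy_objective(x) for x in candidates])
```

The solver never inspects the objective's structure; it only sees the values it is told. This is why the objective can be a formula or an ML model.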

<div class="alert alert-info">
<b>Note</b>

`recommend` works with the `StatefulOptimizationProblem` abstraction and its implementations.
    
</div>

## Introduction

The overall goal of the `recommend` package is to make optimization easier in the OAI context. Hence, the package provides high-level interfaces to deal with `optimizer`'s primitives. This includes the following entities:
* `ControlledParametersConfig` – a mapping which stores information (min/max optimization limits, max delta and step size) for every controlled parameter
* `ProblemFactory` – contains definition of problem creation (objective, penalties, repairs); is used to create a problem class for each optimizable row
* `SolverFactory` – creates `Solver` instance based on provided controls config, domain generator, solver and (optional) stopper definition
* `optimize` – a function that combines `ProblemFactory` and `SolverFactory` to run the optimization for the provided `data`.

The notebook will walk you through the usage of each one of those. 

## Glossary

In the `recommend` package, we use the following wording.

`State` – all the parameters used to define current plant conditions. These parameters are used by the problem for objective/penalties/repairs. We split those parameters into two categories:

* `control` – parameters that we can operate; solver will change those in order to improve objective
* `context` – parameters that we cannot operate; we assume (which might not always be true) that they are independent of the controls, so changing any `control` variable won't change the `context`

<div class="alert alert-info">
<b>Note</b>

If there are variables that define the `state` yet depend on `control` variables, we suggest either re-calculating them manually in the objective/penalties/repairs or predicting them with additional models.
</div>
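
As a small illustration of this split, the sketch below partitions a state row into controls and context using plain pandas. The values are taken from row 0 of the sample data; the variable names (`state`, `control_columns`, `proposal`) are local to this example and not part of `recommend`.

```python
import pandas as pd

# Values taken from row 0 of the sample data; only four columns are kept for brevity.
state = pd.DataFrame(
    {
        "amina_flow": [522.255576],
        "starch_flow": [3199.638567],
        "silica_feed": [18.513333],
        "iron_minus_silica": [34.733333],
    }
)

control_columns = ["amina_flow", "starch_flow"]  # parameters we can operate
context_columns = [c for c in state.columns if c not in control_columns]

# A solver proposal changes controls only; context is carried over unchanged.
proposal = state.copy()
proposal["amina_flow"] = 480.0
```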


## Example data

This is the example dataset that we will use for showcasing `recommend` package functionality.

The data tells us the current values of our parameters. At the end of this notebook, we'll produce recommendations for the controlled parameters at each given time step.

In [4]:
from recommend import datasets

df = datasets.get_sample_recommend_input_data()
df.head()

Unnamed: 0,timestamp,air_flow01,air_flow02,air_flow03,air_flow04,air_flow05,air_flow06,air_flow07,amina_flow,column_level01,...,ore_pulp_flow,ore_pulp_ph,silica_conc,silica_feed,starch_flow,iron_minus_silica,feed_diff_divide_silica,total_column_level,total_air_flow,silica_conc_lagged
0,2017-08-30 23:00:00,299.88293,299.682324,299.884393,299.617858,300.0,300.0,300.0,522.255576,399.943846,...,379.969373,,15.023342,18.513333,3199.638567,34.733333,1.876125,2598.407744,2099.067504,3.964062
1,2017-08-31 02:00:00,299.947659,299.800402,299.931837,299.436669,300.0,300.0,300.0,492.844533,,...,,,14.987169,24.9,2630.064985,23.56,0.946185,2611.246633,2099.116568,4.122605
2,2017-08-31 05:00:00,299.296059,299.392591,299.929091,299.255481,300.0,300.0,300.0,455.151107,,...,380.291418,9.107695,14.170544,22.773333,2634.493391,27.666667,1.214871,2654.60384,2097.873222,3.076667
3,2017-08-31 08:00:00,299.994928,299.894504,299.948689,299.074293,300.0,,300.0,484.799569,437.098448,...,380.088518,9.447949,10.17083,18.52,2861.101697,35.88,1.937365,3187.242855,2098.912414,1.8
4,2017-08-31 11:00:00,299.916354,299.890941,299.889231,298.893105,300.0,300.0,298.207261,415.15145,515.49417,...,,9.312112,11.712113,18.52,3716.986877,35.88,1.937365,3268.605317,2096.796892,1.556667


We'll also extract a row from this dataset to showcase how a row is optimized.

In [5]:
row_to_optimize = df.iloc[[0]]
row_to_optimize

Unnamed: 0,timestamp,air_flow01,air_flow02,air_flow03,air_flow04,air_flow05,air_flow06,air_flow07,amina_flow,column_level01,...,ore_pulp_flow,ore_pulp_ph,silica_conc,silica_feed,starch_flow,iron_minus_silica,feed_diff_divide_silica,total_column_level,total_air_flow,silica_conc_lagged
0,2017-08-30 23:00:00,299.88293,299.682324,299.884393,299.617858,300.0,300.0,300.0,522.255576,399.943846,...,379.969373,,15.023342,18.513333,3199.638567,34.733333,1.876125,2598.407744,2099.067504,3.964062


<div class="alert alert-info">
<b>Note</b>
    
The row is kept as a single-row `DataFrame` (note the double brackets in `df.iloc[[0]]`). This is done to comply with the `optimizer` interface.
</div>

## `ControlledParametersConfig`

This is a mapping that stores the information about controlled parameters. That information includes:
* `name` - control's name; used as column name in the data
* `op_min` – min value which control can take during the optimization
* `op_max` – max value which control can take during the optimization
* `max_delta` - optional; the maximum change that can be made to the control relative to its current value (the change is not constrained if no delta is provided)
* `step_size` – optional (only used by `DiscreteDomainGenerator`); fixed step change to make for control
* `constraint` - optional; use it if you want to constrain the control to be only increasing/decreasing; takes one of three values:
    * None – no constraint to control is applied
    * "decrease" – control can only take values lower than its current value
    * "increase" – control can only take values higher than its current value

Let's create one for our silica dataset. In this task, we can control the following parameters:
* `starch_flow`
* `ore_pulp_ph`
* `ore_pulp_density`
* `amina_flow`
* `ore_pulp_flow`
* `total_column_level`
* `total_air_flow`

We have a YAML config file that defines these settings for each control:
```yaml
- name: starch_flow
  op_min: 3000
  op_max: 4000
  step_size: 200
  max_delta: 800
- name: amina_flow
  op_min: 450
  op_max: 650
  step_size: 50
  max_delta: 100
...
```

Let's load that yaml into a list of dicts `raw_config`.

In [6]:
raw_config = datasets.get_sample_controlled_parameters_raw_config()
raw_config

[{'name': 'starch_flow',
  'op_min': 3000,
  'op_max': 4000,
  'step_size': 200,
  'max_delta': 800},
 {'name': 'amina_flow',
  'op_min': 450,
  'op_max': 650,
  'step_size': 50,
  'max_delta': 100},
 {'name': 'ore_pulp_flow',
  'op_min': 400,
  'op_max': 410,
  'step_size': 2,
  'max_delta': 10},
 {'name': 'ore_pulp_ph',
  'op_min': 9.5,
  'op_max': 10.5,
  'step_size': 0.05,
  'max_delta': 0.4},
 {'name': 'ore_pulp_density',
  'op_min': 1.65,
  'op_max': 1.75,
  'step_size': 0.1,
  'max_delta': 0.1},
 {'name': 'total_air_flow',
  'op_min': 1000,
  'op_max': 2000,
  'step_size': 100,
  'max_delta': 500},
 {'name': 'total_column_level',
  'op_min': 1000,
  'op_max': 5000,
  'step_size': 200,
  'max_delta': 1000}]

Now we can create a `ControlledParametersConfig`.

In [7]:
from recommend import ControlledParametersConfig

controlled_parameters_config = ControlledParametersConfig(raw_config)
controlled_parameters_config

ControlledParametersConfig(
    keys={
        'amina_flow', 'ore_pulp_density', 'ore_pulp_flow', 'ore_pulp_ph',
        'starch_flow', 'total_air_flow', 'total_column_level',
    },
    values=(...),
)
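
Conceptually, the config behaves like a mapping keyed by control name. A rough plain-Python analogue is a dictionary built from the raw list. This is only an illustration of the idea, not how `ControlledParametersConfig` is implemented, and `toy_raw_config` and `config_by_name` are names made up for this sketch.

```python
# Two entries copied from the raw config above.
toy_raw_config = [
    {"name": "starch_flow", "op_min": 3000, "op_max": 4000, "step_size": 200, "max_delta": 800},
    {"name": "amina_flow", "op_min": 450, "op_max": 650, "step_size": 50, "max_delta": 100},
]

# Index the entries by control name for direct lookup.
config_by_name = {entry["name"]: entry for entry in toy_raw_config}

# Basic sanity checks that are worth running on any config before optimizing.
for name, entry in config_by_name.items():
    if entry["op_min"] >= entry["op_max"]:
        raise ValueError(f"{name}: op_min must be below op_max")
    if entry.get("step_size") is not None and entry["step_size"] <= 0:
        raise ValueError(f"{name}: step_size must be positive")

amina_limits = (config_by_name["amina_flow"]["op_min"], config_by_name["amina_flow"]["op_max"])
```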

### `.from_dataframe()`

Alternatively, users can initialize the same structure from a dataframe (e.g., when using TagDict or storing the config in dataframe format). Let's create a table and try loading it via `ControlledParametersConfig.from_dataframe`.

In [8]:
import pandas as pd

df_raw_config = pd.DataFrame(raw_config)
df_raw_config

Unnamed: 0,name,op_min,op_max,step_size,max_delta
0,starch_flow,3000.0,4000.0,200.0,800.0
1,amina_flow,450.0,650.0,50.0,100.0
2,ore_pulp_flow,400.0,410.0,2.0,10.0
3,ore_pulp_ph,9.5,10.5,0.05,0.4
4,ore_pulp_density,1.65,1.75,0.1,0.1
5,total_air_flow,1000.0,2000.0,100.0,500.0
6,total_column_level,1000.0,5000.0,200.0,1000.0


In [9]:
controlled_parameters_config = ControlledParametersConfig.from_dataframe(df_raw_config)
controlled_parameters_config

ControlledParametersConfig(
    keys={
        'amina_flow', 'ore_pulp_density', 'ore_pulp_flow', 'ore_pulp_ph',
        'starch_flow', 'total_air_flow', 'total_column_level',
    },
    values=(...),
)

So we've created the same controls config.
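
The two representations carry the same information: a list of dicts can be turned into a dataframe and back with standard pandas calls. The sketch below shows that round trip on two entries; `toy_raw_config` is a name local to this example.

```python
import pandas as pd

# Two entries copied from the raw config above.
toy_raw_config = [
    {"name": "starch_flow", "op_min": 3000, "op_max": 4000, "step_size": 200, "max_delta": 800},
    {"name": "amina_flow", "op_min": 450, "op_max": 650, "step_size": 50, "max_delta": 100},
]

# List of dicts -> dataframe (one row per control, one column per setting).
df_config = pd.DataFrame(toy_raw_config)

# Dataframe -> list of dicts (one dict per row).
records = df_config.to_dict(orient="records")
```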

## `ProblemFactory`

### Why do we need problem factory?

This class collects all the entities needed for problem creation and streamlines this process.

`Problem` is an entity that contains all the information about the optimization task for a given row, including the objective, penalties, and repairs. Since each of those elements might be specific to the row we are optimizing, we need to define how each one of them is created.

Once a factory is created, we can create a problem for the input row.

### General problem definition

Since the state and control values in our data differ at each index, the optimization task differs for each index as well.

Our optimization problem for any index will be:

$$
\begin{align}
&\min_{\mathbf{x} \in F} & f(\mathbf{x}) & \\
&\text{s.t.} & x_{\text{starch flow}} + x_{\text{amina flow}} \leq 3600 & \\
&& x_{\text{ore pulp ph}} \geq x^{\text{initial}}_{\text{ore pulp ph}} \\
&& x_{\text{ore pulp density}} \leq x^{\text{initial}}_{\text{ore pulp density}}
\end{align}
$$

where $f$ is our model and $F$ is the feasible set defined by the `"op_min"`, `"op_max"`, `"max_delta"`, and `"step_size"` settings for the chosen domain generator. The problem is subject to the total flow constraint and the ore pulp pH and density constraints.

To create such problem for each index, we can define a silica problem factory:
* Objective: the objective will be a simple model inference. We'll load our pre-trained model below and plug it into the problem factory. Note that the model was trained to predict the target column `"silica_conc"`.
* Constraints: for this example, we'll use a single penalty and some repairs. Again, in more practical examples, there are likely to be _many_ more constraints that vary in complexity.

### Load model for objective

In [10]:
silica_conc_model = datasets.get_trained_model()
silica_conc_model

We'll then pass that model to an objective function and use it to evaluate the silica concentration.

<div class="alert alert-info">
<b>Note</b>
    
This model includes feature selection, so all state columns can be passed to the model and it will work as expected. We suggest you keep this invariant and do all feature selection inside the model to avoid the extra hassle of passing those details from outer configs. This can be easily achieved by using the `ModelBase` mixin from the modeling package.
</div>

In [11]:
from sklearn.pipeline import Pipeline


def calculate_objective(
    parameters: pd.DataFrame,
    silica_conc_model: Pipeline,
) -> pd.Series:
    """ Returns objective value estimated by silica conc model. """
    return silica_conc_model.predict(parameters)

### Define simple penalty

We'd like to introduce a penalty on the total of the two flows, `starch_flow` and `amina_flow`. If the total goes above 3600, we penalize the objective by the distance from this threshold multiplied by `0.0125`. To do so, we'll use the `penalty` function from the OptimusAI `optimizer` package.

In [12]:
from optimizer import penalty


def calculate_flow_penalty(parameters: pd.DataFrame) -> pd.Series:
    return parameters["starch_flow"] + parameters["amina_flow"]


flow_penalty = penalty(
    calculate_flow_penalty,
    "<=",
    3600,
    name="starch_and_amina_flow",
    penalty_multiplier=0.0125,
)
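
For intuition, the arithmetic behind this penalty can be reproduced in plain pandas. This is a sketch of the calculation described above, not the `optimizer.penalty` implementation; `threshold`, `multiplier`, `penalty_values`, and `slack` are names local to this example, and the slack definition (distance to the threshold when the constraint holds) is an assumption. The input values come from rows 0 and 1 of the sample data.

```python
import pandas as pd

# starch_flow and amina_flow for rows 0 and 1 of the sample data
parameters = pd.DataFrame(
    {
        "starch_flow": [3199.638567, 2630.064985],
        "amina_flow": [522.255576, 492.844533],
    }
)
threshold, multiplier = 3600, 0.0125

total_flow = parameters["starch_flow"] + parameters["amina_flow"]

# Only the part of the total above the threshold is penalized.
violation = (total_flow - threshold).clip(lower=0)
penalty_values = multiplier * violation

# Assumed slack definition: remaining room below the threshold, NaN when violated.
slack = (threshold - total_flow).where(total_flow <= threshold)
```

Row 0 exceeds the threshold by about 121.9, so its penalty is about 1.52. Row 1 is below the threshold, so its penalty is zero and about 477.1 of room remains.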

### Define simple repairs

Typically, we use repairs to fix "unrealistic" behaviour in the ML model that we use as part of the objective. Consider the following toy example: once we change all the controls, the pH might not be feasible, and plant operators require additional constraints for it. So we have to introduce an additional model that tells us the max value we can set for `"ore_pulp_ph"` in the given state with the new controls. To simplify the example, we'll use random values instead of a model in `predict_max_possible_ph`. In a real study, you would use a model here.

Check out the [common constraints tutorial](./common_constraints.ipynb) for more constraints you might need for your study.

In [13]:
import numpy as np

from optimizer import repair


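def predict_max_possible_ph(df):
    # Toy stand-in for a model: one random value per row (fixed seed)
    rand = np.random.RandomState(42)
    max_possible_ph = rand.rand(df.shape[0])
    return max_possible_ph


def reset_ph_value(df):
    # Resets pH to its initial value; relies on the global `row_to_optimize`
    df["ore_pulp_ph"] = row_to_optimize["ore_pulp_ph"]
    return df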

In [14]:
ore_pulp_ph_repair = repair(
    "ore_pulp_ph",
    "<=",
    predict_max_possible_ph,
    repair_function=reset_ph_value,
)
ore_pulp_ph_repair

<optimizer.constraint.repair.UserDefinedRepair at 0x7faff97dd3f0>
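
The idea behind a repair can be sketched in plain pandas: detect the rows that violate the constraint and overwrite the offending control. `repair_upper_bound` is a hypothetical helper written for this illustration; it is not how `optimizer.repair` is implemented, and whether the real repair function is applied to all rows or only to the violating ones is not something this sketch claims to reproduce.

```python
import pandas as pd


def repair_upper_bound(parameters, column, upper_bound, reset_value):
    """Hypothetical helper: reset `column` wherever it exceeds `upper_bound`."""
    repaired = parameters.copy()  # leave the solver's proposal untouched
    violated = repaired[column] > upper_bound
    repaired.loc[violated, column] = reset_value
    return repaired


# Three candidate solutions proposed by a solver (toy values).
toy_candidates = pd.DataFrame({"ore_pulp_ph": [9.6, 10.4, 10.1]})
toy_repaired = repair_upper_bound(
    toy_candidates, "ore_pulp_ph", upper_bound=10.0, reset_value=9.8
)
```

Unlike a penalty, the repair changes the proposed parameters themselves, so the objective is evaluated on values that already respect the constraint.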

<div class="alert alert-info">
<b>Note</b>
    
As you can see, the repair will be different for each row, since the repair function depends on `row_to_optimize` (the control's initial value).

So we'll be calling those functions in the repair definition of the problem factory, `_create_repairs`.

</div>

### `SilicaProblemFactory` definition

Now we will collect all those pieces into the problem factory.

In [15]:
import typing as tp
from functools import partial

import numpy as np
import pandas as pd

from sklearn.pipeline import Pipeline

from recommend import ObjectiveFunction, ProblemFactoryBase
from optimizer import Penalty, Repair, penalty, repair


class SilicaProblemFactory(ProblemFactoryBase):
    def _create_objective(self, row_to_optimize: pd.DataFrame) -> ObjectiveFunction:
        """
        Returns the objective. It will be used
        to create `optimizer.StatefulOptimizationProblem`.

        Note that the returned function must take only `parameters` as an input
        i.e., all the additional kwargs must be wrapped via `functools.partial`
        before returning the function

        This objective can be either maximized or minimized based on
        the `ProblemFactory.sense` (provided during init).
        """

        return partial(
            calculate_objective,
            silica_conc_model=self._model_registry["silica_conc_model"],
        )

    def _create_penalties(self, row_to_optimize: pd.DataFrame) -> tp.List[Penalty]:
        """Returns penalties. Currently, it includes only flow penalty."""

        flow_penalty = penalty(
            calculate_flow_penalty,
            "<=",
            3600,
            name="starch_and_amina_flow",
            penalty_multiplier=0.0125,
        )
        return [flow_penalty]

    def _create_repairs(self, row_to_optimize: pd.DataFrame) -> tp.List[Repair]:
        """
        Creates repairs:
        * "ore_pulp_ph" will be repaired based on the predict_max_possible_ph predictor

        """


        ore_pulp_ph_repair = repair(
            "ore_pulp_ph",
            "<=",
            predict_max_possible_ph,
            repair_function=partial(
                reset_value, 
                column="ore_pulp_ph",
                value=row_to_optimize["ore_pulp_ph"],
            ),
            check_repaired="never",  # remove this (needed only for this toy example)
        )

        return [ore_pulp_ph_repair]

    
def calculate_objective(
    parameters: pd.DataFrame,
    silica_conc_model: Pipeline,
) -> pd.Series:
    """ Returns objective value estimated by silica conc model. """
    return silica_conc_model.predict(parameters)


def calculate_flow_penalty(parameters: pd.DataFrame) -> pd.Series:
    return parameters["starch_flow"] + parameters["amina_flow"]



def predict_max_possible_ph(df):
    # Toy stand-in for a model: one random value per row (fixed seed)
    rand = np.random.RandomState(42)
    max_possible_ph = rand.rand(df.shape[0])
    return max_possible_ph


def reset_value(df, column, value):
    df[column] = value
    return df

<div class="alert alert-info">
<b>Note</b>

We use `model_registry` attribute to provide models to the objective/penalties/repairs. This registry is provided to the factory and then stored once its instance is created (see the example below).

</div>
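
The pattern used in `_create_objective` (look up a model in a registry, then bind it with `functools.partial`) can be tried in isolation. In the sketch below, `ConstantModel`, `toy_objective`, and `toy_registry` are hypothetical stand-ins; only `functools.partial` and pandas are real APIs.

```python
from functools import partial

import pandas as pd


class ConstantModel:
    """Hypothetical stand-in for a trained model exposing `predict`."""

    def __init__(self, value):
        self._value = value

    def predict(self, parameters):
        # Return the same prediction for every row.
        return pd.Series(self._value, index=parameters.index)


def toy_objective(parameters, silica_conc_model):
    return silica_conc_model.predict(parameters)


toy_registry = {"silica_conc_model": ConstantModel(2.5)}

# After binding the model, the objective takes `parameters` only.
bound_objective = partial(
    toy_objective, silica_conc_model=toy_registry["silica_conc_model"]
)
objective_values = bound_objective(pd.DataFrame({"amina_flow": [450.0, 500.0]}))
```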

### Create problem factory instance

Now, let's create a factory instance. To do that, we'll need to provide controlled parameters config, our problem class, its kwargs and silica concentration predictor.

Since the goal is to have as little silica as possible, we'll provide `sense="minimize"` as part of the problem's kwargs.

In [16]:
from optimizer import StatefulOptimizationProblem

problem_factory = SilicaProblemFactory(
    controlled_parameters_config=controlled_parameters_config,
    problem_class=StatefulOptimizationProblem,
    problem_kwargs={"sense": "minimize"},
    model_registry={"silica_conc_model": silica_conc_model},
)
problem_factory

SilicaProblemFactory(
    optimized_columns=ControlledParametersConfig(
        keys={
            'amina_flow', 'ore_pulp_density', 'ore_pulp_flow', 'ore_pulp_ph',
            'starch_flow', 'total_air_flow', 'total_column_level',
        },
        values=(...),
    ),
    problem_class=StatefulOptimizationProblem,
    problem_kwargs={'sense': 'minimize'},
    model_registry={
        'silica_conc_model',
    },
)

Now we can use the problem factory to produce problems. Let's try creating a problem for the first row of the dataset. To do that, we call the `create` method of the problem factory instance.

### Create a problem and its domain using problem factory

In [17]:
problem = problem_factory.create(row_to_optimize)
problem

<optimizer.problem.stateful_problem.StatefulOptimizationProblem at 0x7faff97e9240>

## `SolverFactory`

This class simplifies the process of creating a solver-stopper pair for each of the problems.

To init `SolverFactory` we need to provide:
* controlled parameters config – this config is required since solver needs to know the domain of optimized parameters
* solver class and its kwargs – the solver class to create and its kwargs
* (optional) stopper class and its kwargs – the stopper class to create (defines criteria for early solver termination) and its kwargs; see the [OAI `optimizer` docs for more](https://brix.quantumblack.com/products/optimus/docs/src/packages/optimizer/docs/source/04_user_guide/05_stopper.html)
* (optional) domain class and its kwargs – the DomainGenerationBase implementation; is used for solver's domain generation ([read more in FAQ section](https://brix.quantumblack.com/products/optimus/docs/src/packages/recommend/src/recommend/notebooks/recommend.html#FAQ))

The factory instance will create a requested solver-stopper pair via `create` method.

For this example, we'll create a factory which will produce `DifferentialEvolutionSolver`  and `NoImprovementStopper` that halts the optimization after a certain number of iterations without an improvement. See [solver](https://brix.quantumblack.com/products/optimus/docs/src/packages/optimizer/docs/source/04_user_guide/03_solver.html) and [stopper](https://brix.quantumblack.com/products/optimus/docs/src/packages/optimizer/docs/source/04_user_guide/05_stopper.html) tutorials for more options.

In [18]:
from recommend import SolverFactory
from optimizer.solvers import DifferentialEvolutionSolver
from optimizer.stoppers import NoImprovementStopper

# Sample kwargs for DifferentialEvolutionSolver
solver_kwargs = {
    "sense": "minimize",
    "seed": 0,
    "maxiter": 100,
    "mutation": [0.5, 1.0],
    "recombination": 0.7,
    "strategy": "best1bin",
}

# Sample kwargs for NoImprovementStopper
stopper_kwargs = {
    "patience": 10,
    "sense": "minimize",
    "min_delta": 0.1    
}

solver_factory = SolverFactory(
    controlled_parameters_config=controlled_parameters_config,
    solver_class=DifferentialEvolutionSolver,
    solver_kwargs=solver_kwargs,
    stopper_class=NoImprovementStopper,
    stopper_kwargs=stopper_kwargs,
)
solver_factory

SolverFactory(
    solver_class=DifferentialEvolutionSolver,
    solver_kwargs={
        'sense': 'minimize',
        'seed': 0,
        'maxiter': 100,
        'mutation': [0.5, 1.0],
        'recombination': 0.7,
        'strategy': 'best1bin',
    },
    stopper_class=NoImprovementStopper,
    stopper_kwargs={
        'patience': 10, 'sense': 'minimize', 'min_delta': 0.1,
    },
    domain_generator=BoundedLinearSpaceDomain(
        controlled_parameters=ControlledParametersConfig(
            keys={
                'amina_flow', 'ore_pulp_density', 'ore_pulp_flow', 'ore_pulp_ph',
                'starch_flow', 'total_air_flow', 'total_column_level',
            },
            values=(...),
        ),
    ),
)
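
To give a feel for what `patience` and `min_delta` control, here is a toy stopper for a minimization problem. `ToyNoImprovementStopper` and its `update` method are written for this illustration under the assumption that an improvement must exceed `min_delta` to reset the counter; the actual `NoImprovementStopper` may differ in its details.

```python
class ToyNoImprovementStopper:
    """Hypothetical stopper: stop after `patience` iterations without improvement."""

    def __init__(self, patience, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self._best = float("inf")
        self._stale_iterations = 0

    def update(self, best_objective):
        """Return True when the optimization should stop (minimization)."""
        if best_objective < self._best - self.min_delta:
            # Improvement larger than min_delta: remember it and reset the counter.
            self._best = best_objective
            self._stale_iterations = 0
        else:
            self._stale_iterations += 1
        return self._stale_iterations >= self.patience


toy_stopper = ToyNoImprovementStopper(patience=3, min_delta=0.1)
objective_history = [12.0, 11.5, 11.45, 11.44, 11.43, 11.0]

# Index of the first iteration at which the stopper asks to stop.
stop_at = next(
    (i for i, value in enumerate(objective_history) if toy_stopper.update(value)),
    None,
)
```

Here the improvements after the second iteration are all smaller than `min_delta`, so the stopper fires once three such iterations have accumulated and the final value of 11.0 is never reached.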

Now we can create a solver-stopper pair for any row. The factory also accepts a list of active controls (if none are provided, the solver optimizes all controls from the previously provided `controlled_parameters_config`). We'll use `problem` to get the list of active controls for this row.

In [19]:
solver_factory.create(row_to_optimize, problem.optimizable_columns)



(<optimizer.solvers.continuous.differential_evolution.DifferentialEvolutionSolver at 0x7faff97e9c30>,
 <optimizer.stoppers.no_improvement.NoImprovementStopper at 0x7fafa8ad3fd0>)

<div class="alert alert-info">
<b>Note</b>
    
The output of solver creation may contain warnings. For example, a control's current value may lie outside the range given in the tag dictionary. Even so, the solver's domain guarantees that the recommendations will always be in the desired range.

</div>

### Creating discrete solver

If you want to use one of the discrete solvers (e.g., `GridSearchSolver`, `HillClimbingSolver`, `DiscreteSimulatedAnnealingSolver`), just pass the new solver class to the factory. The factory will automatically take care of domain generation. See the example below.

In [20]:
from optimizer.solvers import GridSearchSolver

# Example keyword arguments. See docs for more info.
grid_solver_kwargs = {"sense": "minimize", "seed": 0}

disc_solver_factory = SolverFactory(
    solver_class=GridSearchSolver, 
    solver_kwargs=grid_solver_kwargs,
    controlled_parameters_config=controlled_parameters_config,
)
disc_solver, _ = disc_solver_factory.create(row_to_optimize, problem.optimizable_columns)
disc_solver

INFO:recommend.domain_generator.discrete_domain_generator:Created an evenly spaced domain for starch_flow. Starting from 3000.0 with step 200.0. This domain consists of 6 points.
INFO:recommend.domain_generator.discrete_domain_generator:Created an evenly spaced domain for amina_flow. Starting from 450.0 with step 50.0. This domain consists of 5 points.
INFO:recommend.domain_generator.discrete_domain_generator:Created an evenly spaced domain for ore_pulp_flow. Starting from 400.0 with step 2.0. This domain consists of 6 points.
INFO:recommend.domain_generator.discrete_domain_generator:Created an evenly spaced domain for ore_pulp_density. Starting from 1.65 with step 0.1. This domain consists of 2 points.
INFO:recommend.domain_generator.discrete_domain_generator:Created an evenly spaced domain for total_air_flow. Starting from 1000.0 with step 100.0. This domain consists of 11 points.
INFO:recommend.domain_generator.discrete_domain_generator:Created an evenly spaced domain for total_colu



<optimizer.solvers.discrete.grid_search.GridSearchSolver at 0x7faff97e83d0>

The factory creates a discrete solver without any additional parameters being passed to it. Using this factory, we can create a solver-stopper pair for any row in the dataset.
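
The point counts in the log above are consistent with a grid that starts at `op_min` and advances by `step_size` up to `op_max`. The sketch below reproduces those counts for the five controls whose log lines are complete. `evenly_spaced_domain` is a hypothetical helper; the package's discrete domain generator may build its grid differently.

```python
import math

import numpy as np

# (op_min, op_max, step_size) copied from the raw config
toy_controls = {
    "starch_flow": (3000, 4000, 200),
    "amina_flow": (450, 650, 50),
    "ore_pulp_flow": (400, 410, 2),
    "ore_pulp_density": (1.65, 1.75, 0.1),
    "total_air_flow": (1000, 2000, 100),
}


def evenly_spaced_domain(op_min, op_max, step_size):
    """Hypothetical helper: grid from op_min to op_max (inclusive) with a fixed step."""
    # The small epsilon guards against floating-point error in the division.
    n_points = int(np.floor((op_max - op_min) / step_size + 1e-9)) + 1
    return op_min + step_size * np.arange(n_points)


domain_sizes = {
    name: len(evenly_spaced_domain(*spec)) for name, spec in toy_controls.items()
}

# A full grid search evaluates every combination of grid points.
grid_size = math.prod(domain_sizes.values())
```

Even for these five controls, the full grid already contains 3960 combinations, which is worth keeping in mind when choosing `step_size` for a grid search.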

## `optimize`

The final step is to run the optimization. To do so, pass the `problem_factory` and `solver_factory` we just created to `optimize`.

The package provides an `optimize` function for optimizing a dataset in parallel. It uses the factories to create a problem and a solver-stopper pair for each row, which are then used to derive a solution.

We can control the parallelism with the `n_jobs` keyword argument.

In [21]:
from recommend import optimize

solutions = optimize(
    df,
    problem_factory,
    solver_factory,
    n_jobs=-1,
)
solutions

INFO:recommend.optimize._optimize:Creating problem, solver, and stopper for each row


[Parallel(n_jobs=-1)]: Using backend LokyBackend with 8 concurrent workers.
[Parallel(n_jobs=-1)]: Done   2 tasks      | elapsed:    7.4s
[Parallel(n_jobs=-1)]: Done   9 tasks      | elapsed:    9.6s
[Parallel(n_jobs=-1)]: Done  16 tasks      | elapsed:   10.8s
[Parallel(n_jobs=-1)]: Done  25 tasks      | elapsed:   13.1s
[Parallel(n_jobs=-1)]: Done  34 tasks      | elapsed:   15.0s
[Parallel(n_jobs=-1)]: Done  45 tasks      | elapsed:   16.9s
[Parallel(n_jobs=-1)]: Done  56 tasks      | elapsed:   19.2s
[Parallel(n_jobs=-1)]: Done  75 out of  81 | elapsed:   22.8s remaining:    1.8s
[Parallel(n_jobs=-1)]: Done  81 out of  81 | elapsed:   23.8s finished


Solutions(
    keys=[
        0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19,
        20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37,
        38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55,
        56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73,
        74, 75, 76, 77, 78, 79, 80,
    ],
    values=(...),
)

`optimize` returns a `Solutions` object, which is a mapping from the data's index to a `Solution`. This object can export the results to a dataframe using the `to_frame` method.

This method produces an export table with all parameters compared before and after the optimization. By default (if users do not mutate state parameters explicitly), only controlled parameters differ.

The table has the same index as the initial dataset (the one provided to `optimize`) and two column levels:
* first level – initial dataset's column name
* "type" – either "initial" (before the optimization) or "optimized" (after the optimization).

Penalty and slack values are also displayed (see the columns with `_slack` and `_penalty` suffixes).

Note that any context variable has only an "initial" column.

In [22]:
df_solutions = solutions.to_frame()
df_solutions.head()



Unnamed: 0_level_0,timestamp,air_flow01,air_flow02,air_flow03,air_flow04,air_flow05,air_flow06,air_flow07,amina_flow,amina_flow,...,objective,objective,starch_and_amina_flow_penalty,starch_and_amina_flow_penalty,starch_and_amina_flow_slack,starch_and_amina_flow_slack,run_id,is_successful_optimization,uplift,ore_pulp_ph
type,initial,initial,initial,initial,initial,initial,initial,initial,initial,optimized,...,initial,optimized,initial,optimized,initial,optimized,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,optimized
0,2017-08-30 23:00:00,299.88293,299.682324,299.884393,299.617858,300.0,300.0,300.0,522.255576,477.280651,...,13.13186,11.261798,1.523677,0.0,,86.079461,a71312bf-5ace-4857-bef0-4b9f7f3932c7,True,-1.870062,
1,2017-08-31 02:00:00,299.947659,299.800402,299.931837,299.436669,300.0,300.0,300.0,492.844533,482.303442,...,15.009397,11.257896,0.0,0.0,477.090481,73.476038,95256216-dcdf-49fa-81bd-d8627181897d,True,-3.751501,
2,2017-08-31 05:00:00,299.296059,299.392591,299.929091,299.255481,300.0,300.0,300.0,455.151107,481.539637,...,14.508256,10.848755,0.0,0.0,510.355502,99.132547,f2c70158-dd56-4018-b41e-b04c8c1050ad,True,-3.6595,
3,2017-08-31 08:00:00,299.994928,299.894504,299.948689,299.074293,300.0,,300.0,484.799569,483.028604,...,11.175563,10.742631,0.0,0.0,254.098735,56.391049,bc39a032-04ba-450f-882c-388b5399a40c,True,-0.432932,
4,2017-08-31 11:00:00,299.916354,299.890941,299.889231,298.893105,300.0,300.0,298.207261,415.15145,469.504557,...,12.891734,10.988091,6.651729,0.0,,129.387653,4f3b3d6a-50e5-4fa8-b867-a121967655bc,True,-1.903643,


Quick reminder on multi-index usage:
* `df_solutions["amina_flow"]` returns the initial and optimized values for "amina_flow"
* `df_solutions[("amina_flow", "optimized")]` returns the optimized values for "amina_flow"
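
The same selections can be tried on a small frame with two column levels. `toy_solutions` below is built by hand from the first two rows of the table above; it is only a stand-in for the real `df_solutions`.

```python
import pandas as pd

columns = pd.MultiIndex.from_tuples(
    [
        ("amina_flow", "initial"),
        ("amina_flow", "optimized"),
        ("objective", "initial"),
        ("objective", "optimized"),
    ],
    names=[None, "type"],
)
toy_solutions = pd.DataFrame(
    [
        [522.255576, 477.280651, 13.13186, 11.261798],
        [492.844533, 482.303442, 15.009397, 11.257896],
    ],
    columns=columns,
)

# First-level key: a DataFrame with "initial" and "optimized" columns.
amina_both = toy_solutions["amina_flow"]

# Full tuple key: a single Series.
amina_optimized = toy_solutions[("amina_flow", "optimized")]

# Change in the objective between the optimized and the initial state.
objective_change = (
    toy_solutions[("objective", "optimized")] - toy_solutions[("objective", "initial")]
)
```

For these two rows, the computed change in the objective agrees with the values in the `uplift` column.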

## FAQ

### Q: How do I use a model in optimization?

Problem definition might include model usage. Typical usage of the models includes:
* model for predicting the objective
* model for a multistep-objective:
    1. optimizer proposes controls, then first model (soft sensor model) predicts updated state parameters 
    2. and then another model (target model) calculates the final objective
* model for predicting a constraint:
    1. optimizer proposes controls
    2. model predicts some state parameters
    3. constraints use those to penalise our objective/repair our controls
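
The multistep-objective case can be sketched as a chain of two callables. Both "models" below are made-up linear formulas standing in for a soft sensor and a target model; the column name `predicted_state` is also hypothetical.

```python
import pandas as pd


def soft_sensor(parameters):
    # Hypothetical soft sensor: predicts a state parameter from a control.
    return 0.01 * parameters["starch_flow"]


def target_model(parameters):
    # Hypothetical target model: uses the predicted state and another control.
    return parameters["predicted_state"] + 0.001 * parameters["amina_flow"]


def multistep_objective(parameters):
    # Step 1: update the state using the controls proposed by the solver.
    updated = parameters.copy()
    updated["predicted_state"] = soft_sensor(updated)
    # Step 2: evaluate the final objective on the updated state.
    return target_model(updated)


proposed_controls = pd.DataFrame(
    {"starch_flow": [3000.0, 4000.0], "amina_flow": [450.0, 650.0]}
)
multistep_values = multistep_objective(proposed_controls)
```

Working on a copy keeps the solver's proposal unchanged, so the intermediate prediction does not leak into later evaluations.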


<div class="alert alert-info">
<b>Note</b>
    

To provide models to the problem, use the `model_registry` argument when creating an instance of your implementation of `ProblemFactoryBase`. This model registry simplifies model retrieval in the objective/penalties/repairs via the `self._model_registry` dictionary.

</div>

## Next steps

Learn how optimization results can be explained with the [optimization explainer tutorial notebook](./optimization_explainer.ipynb).