In [None]:
%matplotlib inline

# Post-process an optimization dataset.


An optimization problem can be exported as an [OptimizationDataset][gemseo.datasets.optimization_dataset.OptimizationDataset], which can
then be saved in various file formats, such as CSV. The stored data can later be
recovered and visualized, because GEMSEO post-processing algorithms accept an
[OptimizationDataset][gemseo.datasets.optimization_dataset.OptimizationDataset] as input.



In this example, we demonstrate how to use an [OptimizationDataset][gemseo.datasets.optimization_dataset.OptimizationDataset] for
post-processing. The data come from an MDO scenario on the
[Power2][gemseo.problems.optimization.power_2.Power2] problem
and have been saved in an HDF5 file.
The post-processing algorithm used in this example is [OptHistoryView][gemseo.post.opt_history_view.OptHistoryView].



In [None]:
from __future__ import annotations

import numpy as np

from gemseo import execute_post
from gemseo.algos.constraint_tolerances import ConstraintTolerances
from gemseo.algos.design_space import DesignSpace
from gemseo.algos.optimization_problem import OptimizationProblem
from gemseo.datasets.optimization_dataset import OptimizationDataset
from gemseo.datasets.optimization_metadata import OptimizationMetadata
from gemseo.settings.post import OptHistoryView_Settings

First we will recover the use case data from an HDF5 file, and convert it into an
[OptimizationProblem][gemseo.algos.optimization_problem.OptimizationProblem].



In [None]:
problem = OptimizationProblem.from_hdf("power2_opt_pb.h5")

Next, we convert the problem into an [OptimizationDataset][gemseo.datasets.optimization_dataset.OptimizationDataset].



In [None]:
dataset = problem.to_dataset(group_functions=True)

As you can see, the argument `group_functions` must be `True` in order to use the
post-processing. Otherwise, the functions are not grouped according to their role in
the optimization problem (objective, inequality constraints, equality constraints,
observables).
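
To picture what grouping means, here is a minimal sketch in plain Python that does not
call GEMSEO. The function names other than `pow2` and the role labels are assumptions
made for illustration; the actual grouping is performed internally by `to_dataset`.

```python
# Hypothetical function names and roles, loosely modelled on the Power2 problem.
function_roles = {
    "pow2": "objective",
    "ineq1": "inequality constraint",
    "ineq2": "inequality constraint",
    "eq": "equality constraint",
}

# Group the function names by role, preserving their order of appearance.
grouped_names = {}
for name, role in function_roles.items():
    grouped_names.setdefault(role, []).append(name)

print(grouped_names)
```

Without such a grouping, a post-processing algorithm would have no way to tell which
column is the objective and which columns are constraints.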



Now we can execute the post-processing as usual. The only difference is that, instead
of passing a [BaseScenario][gemseo.scenarios.base_scenario.BaseScenario] or an HDF5 file as an argument, we pass the
[OptimizationDataset][gemseo.datasets.optimization_dataset.OptimizationDataset].



In [None]:
execute_post(
    dataset,
    settings_model=OptHistoryView_Settings(
        save=False,
        show=True,
    ),
)

Once an [OptimizationProblem][gemseo.algos.optimization_problem.OptimizationProblem] has been converted to an
[OptimizationDataset][gemseo.datasets.optimization_dataset.OptimizationDataset], no further manipulation is needed to
post-process it. In this case, however, there is no significant advantage over
post-processing the problem directly.



The main advantage of being able to use an [OptimizationDataset][gemseo.datasets.optimization_dataset.OptimizationDataset] is that it allows
GEMSEO post-processing algorithms to be applied to data from any source. To illustrate
this, we recover the data from a CSV file instead of an HDF5 file.
First, we save the previous data in a CSV file.



In [None]:
dataset.to_csv("results.csv")

We can build the [OptimizationDataset][gemseo.datasets.optimization_dataset.OptimizationDataset] directly from the CSV file.



In [None]:
recovered_dataset = OptimizationDataset.from_csv("results.csv")

!!! note

    Since the data recovered from the CSV file come from an existing
    [OptimizationDataset][gemseo.datasets.optimization_dataset.OptimizationDataset], the variables are already grouped. Details on how to
    group the variables when importing ungrouped data can be found
    in [this example][dataset-manipulation].



To post-process an [OptimizationDataset][gemseo.datasets.optimization_dataset.OptimizationDataset], we must attach some optimization
metadata to it. For this, we create an
[OptimizationMetadata][gemseo.datasets.optimization_metadata.OptimizationMetadata]
instance and store it in the attribute [misc][gemseo.datasets.optimization_dataset.OptimizationDataset.misc] of the
dataset under the key `"optimization_metadata"`.
Some of this metadata can be recovered from the dataset itself, but overall,
building it requires knowledge of the problem.



The field `output_names_to_constraint_names` covers the cases where the names of the
constraint functions were changed for one reason or another (an offset, for example).
It takes the form of a dictionary whose keys are the original constraint names and
whose values are lists of the associated names. For the use case at hand, no name was
changed, so the associated constraint names are the names themselves.



In [None]:
output_names_to_constraint_names = {
    constraint_name: constraint_name
    for constraint_name in recovered_dataset.inequality_constraint_names
    + recovered_dataset.equality_constraint_names
}

The optimum iteration can be retrieved from the dataset by looking for the minimum
value of the objective function.



In [None]:
optimum_iteration = recovered_dataset.objective_dataset.idxmin(axis=0).values[0]
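
In a constrained problem, the smallest objective value may be reached at an infeasible
point. The following sketch, using NumPy only and made-up numbers, contrasts the plain
minimum with a minimum restricted to feasible iterations. The arrays and the assumption
that iterations are numbered from 1 are illustrative and are not taken from the CSV file.

```python
import numpy as np

# Toy objective history, one value per iteration (hypothetical numbers).
objective = np.array([3.0, 1.2, 0.4, 0.9])
# Hypothetical feasibility flags for the same iterations.
is_feasible = np.array([True, True, False, True])
# Iteration labels, assumed here to start at 1.
iterations = np.arange(1, objective.size + 1)

# Plain choice: the iteration with the smallest objective value.
best_any = int(iterations[np.argmin(objective)])

# Feasibility-aware choice: mask infeasible iterations with +inf before the search.
masked_objective = np.where(is_feasible, objective, np.inf)
best_feasible = int(iterations[np.argmin(masked_objective)])

print(best_any, best_feasible)
```

Which of the two choices is appropriate depends on the problem; when every stored
iteration is feasible, they coincide.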

The `tolerances` field is an instance of the [ConstraintTolerances][gemseo.algos.constraint_tolerances.ConstraintTolerances] model,
which must be instantiated with the corresponding values. In this case, the default
values are used.



In [None]:
tolerances = ConstraintTolerances()

The last important piece of data to determine is the feasibility of each point. This
can be precomputed and stored in the CSV file. Here, we determine the feasibility
using the tolerances to create a mask.



In [None]:
equality_feasible_mask = (
    np.abs(recovered_dataset.equality_constraint_dataset) <= tolerances.equality
).all(axis=1)

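# An inequality constraint g(x) <= 0 is satisfied when g(x) <= tolerance,
# so no absolute value is taken here.
inequality_feasible_mask = (
    recovered_dataset.inequality_constraint_dataset <= tolerances.inequality
).all(axis=1)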

feasible_iterations = recovered_dataset.index[
    equality_feasible_mask & inequality_feasible_mask
].tolist()
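
The logic of the mask can be checked on a small, self-contained example. This is a
sketch on made-up values using NumPy only: the tolerance values, the arrays and the
1-based iteration numbering are assumptions, and inequality constraints are assumed to
follow the convention g(x) <= 0.

```python
import numpy as np

# Hypothetical tolerances; in a real case they come from ConstraintTolerances.
equality_tolerance = 1e-2
inequality_tolerance = 1e-4

# Toy constraint histories: one row per iteration, one column per constraint.
equality_values = np.array([[0.5], [0.005], [-0.002], [0.0]])
inequality_values = np.array(
    [[0.3, -0.1], [-0.2, -0.4], [-1.0, 0.00005], [-0.5, -0.5]]
)

# An equality constraint h(x) = 0 is satisfied when |h(x)| <= tolerance.
equality_ok = (np.abs(equality_values) <= equality_tolerance).all(axis=1)
# An inequality constraint g(x) <= 0 is satisfied when g(x) <= tolerance.
inequality_ok = (inequality_values <= inequality_tolerance).all(axis=1)

# Iterations are numbered from 1 here, which is an assumption about the index.
iterations = np.arange(1, len(equality_values) + 1)
feasible = iterations[equality_ok & inequality_ok].tolist()
print(feasible)
```

Note that the third iteration counts as feasible even though one inequality value is
far below zero: a strongly negative value satisfies g(x) <= 0 and must not be rejected.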

With all the optimization metadata ready, we can create the
[OptimizationMetadata][gemseo.datasets.optimization_metadata.OptimizationMetadata]
and attribute it to the dataset.



In [None]:
opt_metadata = OptimizationMetadata(
    objective_name="pow2",
    standardized_objective_name="pow2",
    minimize_objective=True,
    use_standardized_objective=False,  # Either True or False according to the user
    tolerances=tolerances,  # The ConstraintTolerances instance created above
    output_names_to_constraint_names=output_names_to_constraint_names,
    feasible_iterations=feasible_iterations,
    optimum_iteration=optimum_iteration,
)

recovered_dataset.misc["optimization_metadata"] = opt_metadata

Since some post-processing algorithms use the input space of the problem,
it can be useful to attach it to the dataset.
For the [Power2][gemseo.problems.optimization.power_2.Power2] problem, we know that the input space is $-1.0 \le x \le 1.0$, where
`x` has 3 components and is initialized to 1.0.



In [None]:
input_space = DesignSpace()
input_space.add_variable("x", 3, lower_bound=-1.0, upper_bound=1.0, value=1.0)

recovered_dataset.misc["input_space"] = input_space
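
Before attaching an input space to data coming from an external file, it can be worth
checking that the stored design points are consistent with the declared bounds. The
following is a minimal sketch with NumPy on a made-up design history; the array values
are hypothetical and the check is not part of the GEMSEO API.

```python
import numpy as np

# Bounds assumed for every component of x, matching the input space described above.
lower_bound, upper_bound = -1.0, 1.0

# Made-up design history: one row per iteration, one column per component of x.
x_history = np.array(
    [
        [1.0, 1.0, 1.0],
        [0.9, 0.85, 0.97],
        [0.8, 0.79, 1.2],
    ]
)

# A row is consistent with the input space when all its components lie within the bounds.
row_in_bounds = ((x_history >= lower_bound) & (x_history <= upper_bound)).all(axis=1)

# Zero-based positions of the rows that violate the bounds.
out_of_bounds_rows = np.flatnonzero(~row_in_bounds).tolist()
print(out_of_bounds_rows)
```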

With all the optimization metadata gathered, we can execute the post-processing.



In [None]:
execute_post(
    recovered_dataset,
    settings_model=OptHistoryView_Settings(
        save=False,
        show=True,
    ),
)

!!! warning

    The post-processing algorithm [GradientSensitivity][gemseo.post.gradient_sensitivity.GradientSensitivity] has an option to compute
    missing gradients. This option cannot be used with an
    [OptimizationDataset][gemseo.datasets.optimization_dataset.OptimizationDataset].


