#### Infill notebook

This notebook validates the Bayesian and non-Bayesian multi-objective adaptive infill strategies.

**Note**: the tests use the analytical optimization problems from Charayron et al. [(1)](https://www.sciencedirect.com/science/article/pii/S1270963823005692?via%3Dihub).

In [None]:
import matplotlib.pyplot as plt

from aero_optim.mf_sm.mf_models import get_model, get_sampler, MultiObjectiveModel
from aero_optim.mf_sm.mf_infill import compute_pareto

from pymoo.indicators.igd import IGD
from pymoo.indicators.igd_plus import IGDPlus
from pymoo.problems import get_problem

from mf_functions import zdt1_hf, zdt1_lf, zdt2_hf, zdt2_lf
from main_mf_infill import bayesian_optimization, non_bayesian_optimization, run_NSGA2

Select the optimization problem to solve: ZDT1 or ZDT2

In [None]:
zdt = "zdt1"

Compute the analytical Pareto front

In [None]:
zdt_pareto = get_problem(zdt).pareto_front()
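
For reference, the analytical fronts of the standard ZDT problems have a closed form: $f_2 = 1 - \sqrt{f_1}$ for ZDT1 and $f_2 = 1 - f_1^2$ for ZDT2, with $f_1 \in [0, 1]$. The dependency-free sketch below evaluates them and checks that a ZDT1 design with all trailing variables at zero lands on the front. The helpers `zdt_front` and `zdt1` are hypothetical names introduced here, and whether `zdt1_hf` matches the standard ZDT1 definition exactly is an assumption.

```python
import math


def zdt_front(name, n_points=100):
    """Return n_points (f1, f2) pairs on the analytical ZDT1 or ZDT2 front."""
    if name not in ("zdt1", "zdt2"):
        raise ValueError(f"unknown problem: {name}")
    front = []
    for i in range(n_points):
        f1 = i / (n_points - 1)  # evenly spaced in [0, 1]
        f2 = 1.0 - math.sqrt(f1) if name == "zdt1" else 1.0 - f1 ** 2
        front.append((f1, f2))
    return front


def zdt1(x):
    """Standard ZDT1 objectives for a design vector x in [0, 1]^n (n >= 2)."""
    f1 = x[0]
    g = 1.0 + 9.0 * sum(x[1:]) / (len(x) - 1)
    f2 = g * (1.0 - math.sqrt(f1 / g))
    return f1, f2


# A design with x[1:] == 0 gives g == 1 and therefore sits on the front.
print(zdt1([0.25, 0.0, 0.0, 0.0, 0.0, 0.0]))
```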

#### 1. Custom Bayesian infill strategy

The Bayesian infill input variables are:

- `seed` the random seed
- `dim` the dimension of the problem
- `n_lf` the number of initial low-fidelity samples to draw
- `n_hf` the number of initial high-fidelity samples to draw
- `n_iter` the number of infill steps
- `infill_lf_size` the number of low-fidelity samples to compute at each infill step
- `infill_pop_size` the population size of the sub-optimization executions
- `infill_nb_gen` the number of generations of the sub-optimization executions
- `bound` the DOE boundaries

**Note**: the low- to high-fidelity infill ratio is 10 to 1.

In [None]:
seed = 123
dim = 6
n_lf = 12
n_hf = 6
n_iter = 10
infill_lf_size = 10
infill_pop_size = 20
infill_nb_gen = 50
bound = [0, 1]
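
These settings fix the total evaluation budget of each infill loop. The sketch below tallies it, assuming one high-fidelity sample is added per infill step (which is what the 10 to 1 ratio suggests when `infill_lf_size = 10`). The helper `evaluation_budget` and its `infill_hf_size` argument are hypothetical and not part of `aero_optim`.

```python
def evaluation_budget(n_lf, n_hf, n_iter, infill_lf_size, infill_hf_size=1):
    """Return the total (low-fidelity, high-fidelity) sample counts after n_iter steps."""
    total_lf = n_lf + n_iter * infill_lf_size
    total_hf = n_hf + n_iter * infill_hf_size  # assumption: 1 HF sample per step
    return total_lf, total_hf


total_lf, total_hf = evaluation_budget(n_lf=12, n_hf=6, n_iter=10, infill_lf_size=10)
print(f"low-fidelity samples: {total_lf}, high-fidelity samples: {total_hf}")
```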

Compute the NSGA-II Pareto front for the given problem

In [None]:
zdt_hf = zdt1_hf if zdt == "zdt1" else zdt2_hf
zdt_lf = zdt1_lf if zdt == "zdt1" else zdt2_lf

zdt_problem = run_NSGA2(zdt_hf, dim, infill_pop_size, infill_nb_gen, bound, seed)
nsga_pareto = compute_pareto(zdt_problem.fitnesses[:, 0], zdt_problem.fitnesses[:, 1])
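
To make the Pareto extraction step concrete, here is a minimal non-dominated filter for two objectives to be minimized. It is a sketch of the general technique only: `pareto_filter` is a hypothetical helper, and the actual `compute_pareto` implementation may differ in both algorithm and return type.

```python
def pareto_filter(points):
    """Return the non-dominated subset of 2-objective points (minimization).

    After sorting by f1 (then f2), a point is non-dominated exactly when its
    f2 is strictly lower than every f2 seen so far.
    """
    front = []
    best_f2 = float("inf")
    for f1, f2 in sorted(points):
        if f2 < best_f2:
            front.append((f1, f2))
            best_f2 = f2
    return front


samples = [(0.2, 0.8), (0.5, 0.5), (0.6, 0.7), (0.9, 0.1), (0.3, 0.9)]
print(pareto_filter(samples))
```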

Build the nested LHS sampler and evaluate the initial DOEs

In [None]:
mf_sampler = get_sampler(dim, bounds=bound, seed=seed, nested_doe=True)
x_lf_DOE, x_hf_DOE = mf_sampler.sample_mf(n_lf, n_hf)
y_lf_DOE = zdt_lf(x_lf_DOE)
y_hf_DOE = zdt_hf(x_hf_DOE)
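
The idea behind a nested DOE can be sketched without any dependency, assuming "nested" means that every high-fidelity point is also a low-fidelity point. The helpers `lhs` and `nested_doe` are hypothetical and are not the `aero_optim` sampler; in particular, drawing the high-fidelity points as a random subset is a simplification that does not keep the Latin hypercube property at the high-fidelity level.

```python
import random


def lhs(n, dim, rng):
    """Latin hypercube sample of n points in [0, 1]^dim."""
    columns = []
    for _ in range(dim):
        strata = list(range(n))
        rng.shuffle(strata)  # one point per stratum, in random order
        columns.append([(s + rng.random()) / n for s in strata])
    return [tuple(col[i] for col in columns) for i in range(n)]


def nested_doe(n_lf, n_hf, dim, seed):
    """Return (x_lf, x_hf) where x_hf is a random subset of x_lf."""
    rng = random.Random(seed)
    x_lf = lhs(n_lf, dim, rng)
    x_hf = rng.sample(x_lf, n_hf)
    return x_lf, x_hf


x_lf, x_hf = nested_doe(n_lf=12, n_hf=6, dim=6, seed=123)
print(len(x_lf), len(x_hf))
```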

Build the multi-objective co-kriging model

In [None]:
# one single-output co-kriging model per objective
model1 = get_model(model_name="mfsmt", dim=dim, config_dict={}, outdir="", seed=seed)
model2 = get_model(model_name="mfsmt", dim=dim, config_dict={}, outdir="", seed=seed)
mfsmt = MultiObjectiveModel([model1, model2])
mfsmt.set_DOE(
    x_lf=x_lf_DOE,
    x_hf=x_hf_DOE,
    y_lf=[y_lf_DOE[:, 0], y_lf_DOE[:, 1]],
    y_hf=[y_hf_DOE[:, 0], y_hf_DOE[:, 1]],
)
mfsmt.train()
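
For context, co-kriging surrogates are commonly built on the autoregressive formulation of Kennedy and O'Hagan, written below. Whether `mfsmt` uses exactly this form is an assumption, and the symbols $\rho$ and $\delta$ are introduced here for illustration only.

$$
\hat{y}_{hf}(x) = \rho \, \hat{y}_{lf}(x) + \delta(x)
$$

Here $\hat{y}_{lf}$ is a Gaussian process fitted to the low-fidelity data, $\rho$ is a scaling factor, and $\delta$ is a Gaussian process that models the discrepancy between fidelities. In this notebook one such model is trained per objective and the two are wrapped in `MultiObjectiveModel`.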

Bayesian adaptive infill loop

**Note**: this should take around 1.5 minutes

In [None]:
bayesian_optimization(mfsmt, zdt_lf, zdt_hf, n_iter, infill_lf_size, infill_nb_gen, True, dim, bound, seed)

Compute the Pareto front of the high-fidelity dataset

In [None]:
mfsmt_pareto = compute_pareto(mfsmt.models[0].y_hf_DOE, mfsmt.models[1].y_hf_DOE)

#### 2. Custom non-Bayesian infill strategy

The non-Bayesian infill strategy takes the same input variables as the Bayesian one.

**Note**: both strategies start from the same initial DOEs.

Build the MFDNN multi-output model

In [None]:
mfdnn_config = {
    "mfdnn": {
        "nested_doe": True,
        "pretraining": True,
        "NNL": {
            "layer_sizes_NNL": [32, 32, 32, 32, 32, 32, 2],
            "optimizer": {
                "lr": 1e-3,
                "weight_decay": 0
            },
            "loss_target": 1e-5,
            "niter": 10000
        },
        "NNH": {
            "layer_sizes_NNH1": [16, 2],
            "layer_sizes_NNH2": [16, 16, 2],
            "optimizer": {
                "lr": 1e-4,
                "weight_decay_NNH1": 0,
                "weight_decay_NNH2": 1e-4
            },
            "loss_target": 1e-5,
            "niter": 20000
        }
    }
}

mfdnn = get_model(model_name="mfdnn", dim=dim, config_dict=mfdnn_config, outdir="test", seed=seed)
mfdnn.set_DOE(x_lf=x_lf_DOE, x_hf=x_hf_DOE, y_lf=y_lf_DOE, y_hf=y_hf_DOE)
mfdnn.train()

Non-Bayesian adaptive infill loop

**Note**: this should take around 3.3 minutes

In [None]:
non_bayesian_optimization(mfdnn, zdt_lf, zdt_hf, n_iter, infill_lf_size, infill_nb_gen, infill_pop_size, dim, bound, seed)

Compute the Pareto front of the high-fidelity dataset

In [None]:
mfdnn_pareto = compute_pareto(mfdnn.y_hf_DOE[:, 0], mfdnn.y_hf_DOE[:, 1])

Plot the adaptive infill results

In [None]:
fig, ax = plt.subplots(1, 1, figsize=(8, 8))
ax.plot(zdt_pareto[:, 0], zdt_pareto[:, 1], color="r", label="true Pareto", zorder=-1)
ax.scatter(nsga_pareto[:, 0], nsga_pareto[:, 1], marker="x", color="k", label="NSGA2 Pareto")
ax.scatter(mfsmt.models[0].y_hf_DOE[:n_hf], mfsmt.models[1].y_hf_DOE[:n_hf], marker="s", color="k", label="initial DOE")
ax.scatter(mfsmt.models[0].y_hf_DOE[n_hf:], mfsmt.models[1].y_hf_DOE[n_hf:], marker="^", color="blue", label="mfsmt hf infills")
ax.scatter(mfdnn.y_hf_DOE[n_hf:, 0], mfdnn.y_hf_DOE[n_hf:, 1], marker="v", color="green", label="mfdnn hf infills")
ax.set(xlabel='$J_1$', ylabel='$J_2$')
plt.legend()

Compute the IGD and IGD+ performance indicators

In [None]:
igd = IGD(zdt_pareto)
print(f"IGD MFSMT: {igd(mfsmt_pareto)}\nIGD MFDNN: {igd(mfdnn_pareto)}\n")

igdp = IGDPlus(zdt_pareto)
print(f"IGD+ MFSMT: {igdp(mfsmt_pareto)}\nIGD+ MFDNN: {igdp(mfdnn_pareto)}")
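
As a reminder of what these indicators measure, the sketch below implements the standard definitions in plain Python: IGD averages, over the reference front, the Euclidean distance from each reference point to its nearest approximation point, while IGD+ only counts the components in which the approximation point is worse than the reference point. The helper `igd` is hypothetical and is not the pymoo implementation; lower values are better for both indicators.

```python
import math


def igd(reference, approx, plus=False):
    """Inverted generational distance (IGD), or IGD+ when plus=True (minimization)."""
    total = 0.0
    for z in reference:
        best = float("inf")
        for a in approx:
            if plus:
                # keep only the amount by which a is worse than z
                diffs = [max(ai - zi, 0.0) for ai, zi in zip(a, z)]
            else:
                diffs = [ai - zi for ai, zi in zip(a, z)]
            best = min(best, math.sqrt(sum(d * d for d in diffs)))
        total += best
    return total / len(reference)


reference = [(0.0, 1.0), (1.0, 0.0)]
approx = [(0.0, 1.0), (0.6, 0.0)]
print(igd(reference, approx), igd(reference, approx, plus=True))
```

In this toy case the point `(0.6, 0.0)` is no worse than the reference point `(1.0, 0.0)` in either objective, so it contributes nothing to IGD+ while still contributing its Euclidean distance to IGD.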