<h1 style="text-align: center;">Solving use case [SUC]: Solving a problem with MetaGen</h1>

# $P_4$ problem
Domain:
$$Learning\;rate \models Def^{R} = \langle 0.0, 0.000001\rangle \\ Ema \models Def^{C} = \{True, False\} \\ Arch \models Def^{D} = \langle 2,10,\,Def^{G} = \{Neurons \models Def^{I} = \langle 25, 300\rangle, Activation \models Def^{C} = \{relu, sigmoid, softmax, tanh\}, Dropout \models Def^{R} = \langle 0.0, 0.45\rangle\}\rangle$$

Fitness function:
$$LSTM(Learning\;rate, Ema, Arch)$$

This problem aims to find the best neural network architecture and hyperparameters for a regression model based on specific data.

In [None]:
%pip install pymetagen-datalabupo

In [1]:
from metagen.framework import Domain, Solution
from metagen.metaheuristics import RandomSearch

In accordance with the domain description of $P_4$, two general variables are defined: the $REAL$ variable `learning_rate`, defined with the `define_real` method, which falls in the interval $[0.0, 0.000001]$, and the $CATEGORICAL$ variable `ema`, which can take either `True` or `False` as defined by the `define_categorical` method. These two variables control the optimization algorithm in the neural network training process.

The architecture of the network is controlled by the `arch` variable, defined with the `define_dynamic_structure` method, whose size varies between $2$ and $10$. The components of `arch` are of type $GROUP$; the group is named `layer` and is created with the `define_group` method. Each element of `layer` is defined as follows. The $INTEGER$ element `neurons`, which falls in the interval $[25, 300]$, is defined with the `define_integer_in_group` method. The $CATEGORICAL$ element `activation`, defined with the `define_categorical_in_group` method, can take the values `relu`, `sigmoid`, `softmax`, or `tanh`. The $REAL$ element `dropout`, which falls in the interval $[0.0, 0.45]$, is defined with the `define_real_in_group` method. Finally, the $GROUP$ variable `layer` is linked to the $DYNAMIC$ structure `arch` by means of the `set_structure_to_variable` method.

In [2]:
p4_domain = Domain()
p4_domain.define_real("learning_rate", 0.0, 0.000001)
p4_domain.define_categorical("ema", [True, False])
p4_domain.define_dynamic_structure("arch", 2, 10)
p4_domain.define_group("layer")
p4_domain.define_integer_in_group("layer", "neurons", 25, 300)
p4_domain.define_categorical_in_group("layer", "activation", ["relu", "sigmoid", "softmax", "tanh"])
p4_domain.define_real_in_group("layer", "dropout", 0.0, 0.45)
p4_domain.set_structure_to_variable("arch", "layer")
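
To make the shape of this search space concrete, the following sketch draws one random point from it using only the Python standard library. It does not use MetaGen: the helper names `sample_layer` and `sample_p4_point` are hypothetical, and plain dictionaries stand in for MetaGen's `Solution` object.

```python
import random

ACTIVATIONS = ["relu", "sigmoid", "softmax", "tanh"]


def sample_layer(rng: random.Random) -> dict:
    """Draw one `layer` group (hypothetical helper, not part of MetaGen)."""
    return {
        "neurons": rng.randint(25, 300),        # INTEGER in [25, 300]
        "activation": rng.choice(ACTIVATIONS),  # CATEGORICAL
        "dropout": rng.uniform(0.0, 0.45),      # REAL in [0.0, 0.45]
    }


def sample_p4_point(rng: random.Random) -> dict:
    """Draw one point of the P4 search space as a plain dictionary."""
    n_layers = rng.randint(2, 10)               # DYNAMIC size in [2, 10]
    return {
        "learning_rate": rng.uniform(0.0, 0.000001),
        "ema": rng.choice([True, False]),
        "arch": [sample_layer(rng) for _ in range(n_layers)],
    }


point = sample_p4_point(random.Random(0))
print(len(point["arch"]), point["ema"], point["learning_rate"])
```

Every candidate explored by a metaheuristic has this structure: two scalar hyperparameters plus a variable-length list of layer descriptions.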

The fitness function must perform three main tasks: build the neural network architecture, train the network with the target dataset, and evaluate the model with new instances. The function returns the mean absolute percentage error ($MAPE$) of the evaluation as its result. The neural network model with the lowest $MAPE$ will be considered the best.
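
As a reminder of what the metric measures, $MAPE = \frac{100}{n}\sum_{i}\left|\frac{y_i - \hat{y}_i}{y_i}\right|$. The sketch below is a plain-Python version of that formula; the `eps` guard against division by zero is an assumption modelled on how library implementations usually protect the denominator, and the function name `mape` is illustrative.

```python
def mape(y_true, y_pred, eps=1e-7):
    """Mean absolute percentage error, expressed in percent."""
    ratios = [abs(t - p) / max(abs(t), eps) for t, p in zip(y_true, y_pred)]
    return 100.0 * sum(ratios) / len(ratios)


# Two predictions that are each 10 % away from the target
print(mape([100.0, 200.0], [110.0, 180.0]))

# Predicting zero for every sample: each ratio is |y| / |y| = 1
print(mape([0.01, -0.02], [0.0, 0.0]))
```

Because the error is relative to the target, a model that predicts zero everywhere scores 100 whenever no target is exactly zero, which is worth keeping in mind when reading a $MAPE$ close to 100 on standardized targets.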

For the first task, the function `build_neural_network` builds the neural network architecture from the hyperparameters held by the `Solution` object, using the `tensorflow` and `keras` packages.

First, the base `tensorflow` model is created with the `Sequential` class; then the `LSTM` layers are added in a loop.

To do so, the $DYNAMIC$ variable `arch` is iterated over with a `for` loop, as if it were a Python `list`; the `enumerate` function provides the index `i` of each iteration. For each component of `arch`, the `neurons`, `activation`, and `dropout` values are retrieved with the Python bracket operator, passing the name of the corresponding element of the $GROUP$ (`layer`). On each iteration, an `LSTM` layer with the `neurons` and `activation` parameters and a `Dropout` layer with the `dropout` parameter are added to the model.

After the loop, a `Dense` layer is added to obtain the desired output. The model is compiled using the `Adam` optimizer with the `learning_rate` and `ema` values obtained from the `Solution` object with the Python bracket operator.

Finally, the function returns the compiled neural network.

macOS note: GPU acceleration must be disabled, since the `tensorflow-metal` plugin currently does not support the exponential moving average (EMA) option.

In [3]:
import tensorflow as tf
tf.config.set_visible_devices([], "GPU")

In [4]:
def build_neural_network(solution: Solution) -> tf.keras.Sequential:
    # Architecture building
    model = tf.keras.Sequential()

    for i, layer in enumerate(solution["arch"]):
        neurons = layer["neurons"]
        activation = layer["activation"]
        dropout = layer["dropout"]
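        # Every LSTM layer except the last one must return the full sequence;
        # the last index produced by enumerate is len(...) - 1
        rs = i < len(solution["arch"]) - 1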
        model.add(tf.keras.layers.LSTM(neurons, activation=activation, return_sequences=rs))
        model.add(tf.keras.layers.Dropout(dropout))
    model.add(tf.keras.layers.Dense(1, activation="tanh"))
    # Model compilation
    learning_rate = solution["learning_rate"]
    ema = solution["ema"].value
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=learning_rate, use_ema=ema),
                  loss="mean_squared_error", metrics=[tf.keras.metrics.MAPE])
    return model
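
When `LSTM` layers are stacked, every layer but the last needs `return_sequences=True` so that the next `LSTM` layer receives a three-dimensional input, while the last one returns only its final output so that the `Dense` layer produces one value per sample. Since `enumerate` counts from zero, the last layer has index `len(arch) - 1`. The sketch below isolates that rule without TensorFlow; `return_sequences_flags` is a hypothetical helper introduced only for illustration.

```python
from typing import List


def return_sequences_flags(n_layers: int) -> List[bool]:
    """True for every stacked recurrent layer except the last one."""
    return [i < n_layers - 1 for i in range(n_layers)]


for i, flag in enumerate(return_sequences_flags(3)):
    print(f"layer {i}: return_sequences={flag}")
```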

Before coding the fitness function, the dataset is generated with the `make_regression` function from the `sklearn` package and split into two groups: one for training the model and one for validating it. Both groups are then standardized with scalers fitted on the training group to improve the training process for the neural network.

In [5]:
from sklearn.datasets import make_regression
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
import numpy as np

scaler_x = StandardScaler()
scaler_y = StandardScaler()

x, y = make_regression(n_samples=1000, n_features=24)

xs_train, xs_val, ys_train, ys_val = train_test_split(
    x, y, test_size=0.33, random_state=42)

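xs_train = scaler_x.fit_transform(xs_train)
# StandardScaler expects 2-D input, so the 1-D targets are reshaped to one column
ys_train = scaler_y.fit_transform(ys_train.reshape(-1, 1))
xs_val = scaler_x.transform(xs_val)
ys_val = scaler_y.transform(ys_val.reshape(-1, 1))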
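
The scalers are fitted on the training split only and then reused on the validation split, so no information from the validation data leaks into training. A minimal standard-library sketch of that idea follows; `fit_standardizer` and `standardize` are hypothetical helpers, and the toy numbers are made up for illustration.

```python
from statistics import fmean, pstdev


def fit_standardizer(values):
    """Estimate mean and (population) standard deviation from training data."""
    return fmean(values), pstdev(values)


def standardize(values, mean, std):
    """Apply a previously fitted standardization to any split."""
    return [(v - mean) / std for v in values]


train = [2.0, 4.0, 6.0, 8.0]   # toy training targets
val = [5.0, 10.0]              # toy validation targets

mean, std = fit_standardizer(train)         # statistics come from train only
train_s = standardize(train, mean, std)
val_s = standardize(val, mean, std)         # reuses the training statistics
print(train_s, val_s)
```

After this step the training split has zero mean and unit variance, whereas the validation split generally does not, because it is expressed in the training split's units.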

After generating the dataset, it must be reshaped from a two-dimensional array to a three-dimensional array to meet the specifications of the `tensorflow` `LSTM` class. This is achieved using the `numpy` `reshape` method.

In [6]:
x_train = np.reshape(xs_train, (xs_train.shape[0], xs_train.shape[1], 1))
y_train = np.reshape(ys_train, (ys_train.shape[0], 1))
x_val = np.reshape(xs_val, (xs_val.shape[0], xs_val.shape[1], 1))
y_val = np.reshape(ys_val, (ys_val.shape[0], 1))
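
Keras recurrent layers expect input of shape `(samples, timesteps, features)`. With the reshape above, each of the 24 generated features is treated as one time step carrying a single value. The toy example below shows the same transformation on a tiny array; the array contents are made up for illustration.

```python
import numpy as np

batch = np.arange(6, dtype=float).reshape(2, 3)   # 2 samples, 3 features
# Append a trailing axis: 3 "time steps", each with 1 feature
sequences = np.reshape(batch, (batch.shape[0], batch.shape[1], 1))

print(batch.shape, sequences.shape)
```

The values and their order are unchanged; only the shape metadata differs.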

The fitness function calls the `build_neural_network` function to obtain the compiled model, fits it with the training dataset, and evaluates the resulting model on the validation dataset. The evaluation step returns the loss followed by the $MAPE$ value; the fitness function returns the latter.

In [7]:
def p4_fitness(solution: Solution) -> float:
    model = build_neural_network(solution)
    model.fit(x_train, y_train, epochs=10, batch_size=1024)
    mape = model.evaluate(x_val, y_val)[1]
    return mape

To summarize, the `p4_domain` and `p4_fitness` elements are passed to the `RandomSearch` metaheuristic, obtaining a hyperparameter solution for this problem by calling the `run` method.

In [8]:
p4_solution: Solution = RandomSearch(p4_domain, p4_fitness, search_space_size=5, iterations=2).run()

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
[The same ten-epoch log repeats for each further candidate evaluation; the captured output is truncated.]
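
Each ten-epoch block in the log corresponds to one call to the fitness function. For intuition, the loop below is a generic random search written from scratch: it samples candidates, evaluates each one, and keeps the one with the lowest fitness. It is not MetaGen's `RandomSearch` implementation, and the names `random_search`, `sample`, `fitness`, and `n_candidates` are assumptions made for this sketch.

```python
import random
from typing import Callable, Tuple


def random_search(sample: Callable[[random.Random], float],
                  fitness: Callable[[float], float],
                  n_candidates: int,
                  seed: int = 0) -> Tuple[float, float]:
    """Evaluate randomly drawn candidates and return the best (lowest) one."""
    rng = random.Random(seed)
    best_x, best_f = None, float("inf")
    for _ in range(n_candidates):
        x = sample(rng)          # draw a candidate from the search space
        f = fitness(x)           # one (possibly expensive) evaluation
        if f < best_f:           # minimization, as with MAPE
            best_x, best_f = x, f
    return best_x, best_f


# Toy problem: minimize (x - 2)^2 over [-5, 5]
best_x, best_f = random_search(lambda rng: rng.uniform(-5.0, 5.0),
                               lambda x: (x - 2.0) ** 2,
                               n_candidates=200)
print(best_x, best_f)
```

In the notebook, every evaluation trains a network for ten epochs, so the number of candidates directly determines the running time.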

Finally, the `p4_solution` is printed.

In [9]:
print(p4_solution)

F = 99.71414184570312	{arch = ['F = 1.7976931348623157e+308\t{activation = tanh , dropout = 0.08558638183869635 , neurons = 218}', 'F = 1.7976931348623157e+308\t{activation = sigmoid , dropout = 0.20708328585614044 , neurons = 112}', 'F = 1.7976931348623157e+308\t{activation = tanh , dropout = 0.38348844257114484 , neurons = 210}', 'F = 1.7976931348623157e+308\t{activation = sigmoid , dropout = 0.33847986426610444 , neurons = 166}', 'F = 1.7976931348623157e+308\t{activation = sigmoid , dropout = 0.3828465841819473 , neurons = 137}', 'F = 1.7976931348623157e+308\t{activation = tanh , dropout = 0.23568220234904635 , neurons = 207}', 'F = 1.7976931348623157e+308\t{activation = sigmoid , dropout = 0.20107332338948886 , neurons = 34}', 'F = 1.7976931348623157e+308\t{activation = sigmoid , dropout = 0.17064890862009371 , neurons = 199}', 'F = 1.7976931348623157e+308\t{activation = sigmoid , dropout = 0.053820944444486235 , neurons = 232}'] , ema = True , learning_rate = 9.124808946416492e-07
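
In the printed solution, the outer `F` is the $MAPE$ returned by the fitness function, while every entry of `arch` shows `F = 1.7976931348623157e+308`. That number is the largest finite double-precision value, as the check below confirms. Reading it as an unset default for the nested layer entries, rather than a measured error, is an assumption and is not confirmed by the output itself.

```python
import sys

placeholder = 1.7976931348623157e+308   # value printed for each layer entry
print(placeholder == sys.float_info.max)
```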