In [1]:
import os
os.chdir("../../..")

# Quickstart

In this tutorial, to keep computation time low (training a neural network is time-consuming), we use a small time series forecasting task from the **GluonTS** package called *m1_monthly*.


More applications are detailed here: [Examples](../Examples/index.rst)

### Loading the dataset

In [11]:

from dragon.experiments.monash_archive.dataset import gluonts_dataset
from dragon.experiments.monash_archive.datasets_configs import m1_monthly_config

# Remove unnecessary logs from imported packages
import logging

log_gluonts = logging.getLogger("gluonts")
log_gluonts.setLevel('CRITICAL')
log_lightning = logging.getLogger("pytorch_lightning")
log_lightning.setLevel('CRITICAL')
# An empty name returns the root logger
log_root = logging.getLogger("")
log_root.setLevel('CRITICAL')

train_ds, test_ds, config = gluonts_dataset(m1_monthly_config)
config['SaveDir'] = config['PathName'] + "_"

### Defining the Loss Function

The loss function measures how well a Deep Neural Network (DNN) performs on the considered task. It should handle both the training of the DNN and the computation of the metric on the validation set. The user must therefore define a model, a training procedure, and a testing procedure.

We then wrap this function with `zellij.core.Loss` from **Zellij**. Setting the argument `MPI` to `True` enables the distributed version of the `Loss` object.
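
To make the role of such a wrapper concrete, here is a minimal pure-Python sketch of an object that evaluates a list of points and keeps track of the best one. `TrackedLoss` and `toy_objective` are hypothetical names used only for illustration. This is not Zellij's implementation; only the attribute names (`best_point`, `best_score`, `calls`, `all_solutions`, `all_scores`) mirror the ones used later in this tutorial.

```python
import math


class TrackedLoss:
    """Hypothetical wrapper: evaluates points and records the history."""

    def __init__(self, objective):
        self.objective = objective
        self.calls = 0
        self.all_solutions = []
        self.all_scores = []
        self.best_point = None
        self.best_score = math.inf

    def __call__(self, points):
        scores = []
        for point in points:
            score = self.objective(point)
            self.calls += 1
            self.all_solutions.append(point)
            self.all_scores.append(score)
            # Assumption: lower scores are better.
            if score < self.best_score:
                self.best_point, self.best_score = point, score
            scores.append(score)
        return scores


def toy_objective(point):
    # Stand-in for "train the DNN and return the validation metric".
    return sum(x * x for x in point)


toy_loss = TrackedLoss(toy_objective)
toy_scores = toy_loss([[1.0, 2.0], [0.5, 0.5]])
print(toy_loss.best_point, toy_loss.best_score, toy_loss.calls)
```

Keeping the bookkeeping inside the wrapper means the search algorithm only needs to call the wrapped function on a list of candidate points.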


##### DNN definition

The class `dragon.experiments.monash_archive.training.GluontsNet`, designed specifically for the **GluonTS** forecasting datasets, handles the creation of the DNN as well as its training and testing procedures.

In [12]:
from dragon.experiments.monash_archive.training import GluontsNet
m1_monthly_config['NumEpochs'] = 20
model = GluontsNet(train_ds, test_ds, m1_monthly_config)

##### Loss function definition

In [13]:

from zellij.core import Loss    

loss = Loss(MPI=False, verbose=False, save=True)(model.get_nn_forecast)

### Search Space definition

To define a search space, one needs to define the variables (`var`) that will be optimized.

DNNs are modeled as an **AdjMatrix** (`dragon.search_space.dags.AdjMatrixVariable`), parametrized by a set of candidate operations. Each candidate operation is modeled as an **ArrayVar** (`zellij.core.ArrayVar`) containing the operation name and the associated *hyperparameters*. Hyperparameters can be of type:

* **Floats**: `zellij.core.FloatVar`, e.g: learning rate, dropout rate, etc.
* **Integers**: `zellij.core.IntVar`, e.g: output dimension, kernel size, etc.
* **Categorical**: `zellij.core.CatVar`, e.g: activation function, pooling type, etc.

Typical candidate operation variables are already defined in the package, in `dragon.search_space.variables`. They are based on `nn.Module` classes defined in `dragon.search_space.bricks`.
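
As an illustration of the three hyperparameter types listed above, the following standard-library sketch draws a random value of each kind and assembles an operation description shaped like the `['MLP', 301, 'swish']` entries printed later in this tutorial. The helper names and the value ranges are assumptions made for this example; they are not the definitions used in `dragon.search_space.variables`.

```python
import random


def sample_float(rng, low, high):
    # Float hyperparameter, e.g. a dropout rate.
    return rng.uniform(low, high)


def sample_int(rng, low, high):
    # Integer hyperparameter, e.g. an output dimension (bounds inclusive).
    return rng.randint(low, high)


def sample_cat(rng, choices):
    # Categorical hyperparameter, e.g. an activation function name.
    return rng.choice(choices)


def sample_mlp_operation(rng):
    # Hypothetical layout: operation name, output dimension, activation.
    return [
        "MLP",
        sample_int(rng, 1, 512),
        sample_cat(rng, ["swish", "gelu", "sigmoid", "id"]),
    ]


rng = random.Random(0)
operation = sample_mlp_operation(rng)
dropout = sample_float(rng, 0.0, 0.5)
print(operation, dropout)
```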

In [14]:
from zellij.core.variables import CatVar, ArrayVar, DynamicBlock
from zellij.utils.neighborhoods import ArrayInterval, DynamicBlockInterval

from dragon.search_algorithm.neighborhoods import LayersInterval, AdjMatrixInterval
from dragon.search_space.dags import AdjMatrixVariable
from dragon.search_space.variables import unitary_var, mlp_var, activation_var, create_int_var

# We define the candidate operations for each node in the graph. Here we only consider multi-layer perceptron and identity operations.
def operations_var(label, shape, size):
    return DynamicBlock(
        label,
        CatVar(
            label + "Candidates",
            [
                unitary_var(label + " Unitary"),
                mlp_var(label + " MLP"),
            ],
            neighbor=LayersInterval([2, 1]),
        ),
        size,
        neighbor=DynamicBlockInterval(neighborhood=2),
    )

# We define the search space: a graph handling one-dimensional data, and the final activation function before the prediction.
def NN_monash_var(label="Neural Network", shape=1000, size=10):
    NeuralNetwork = ArrayVar(
        AdjMatrixVariable(
            "Cell",
            operations_var("Feed Cell", shape, size),
            neighbor=AdjMatrixInterval()
        ),
        activation_var("NN Activation"),
        create_int_var("Seed", None, 0, 10000),
        label=label,
        neighbor=ArrayInterval(),
    )
    return NeuralNetwork

sp = NN_monash_var(shape=m1_monthly_config["Lag"], size=3)

Once your search space is defined, you can draw random points:

In [15]:
p1,p2 = sp.random(), sp.random()
print("First random point: ", p1)
print("Second random point: ", p2)


First random point:  [NODES: [['Input'], ['concat', 'MLP', 301, 'swish'], ['mul', 'MLP', 164, 'id']] | MATRIX:[[0, 1, 1], [0, 0, 1], [0, 0, 0]], 'sigmoid', 5684]
Second random point:  [NODES: [['Input'], ['mul', 'Identity'], ['mul', 'MLP', 417, 'softmax']] | MATRIX:[[0, 1, 0], [0, 0, 1], [0, 0, 0]], 'gelu', 8854]
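
The `MATRIX` part of each point can be read as an adjacency matrix. The sketch below assumes that a 1 at row `i`, column `j` means an edge from node `i` to node `j`. This convention is consistent with the upper-triangular matrices printed above, but it is an assumption and is not confirmed by this tutorial. The function names `edges_from_matrix` and `is_strictly_upper_triangular` are hypothetical.

```python
def edges_from_matrix(matrix):
    """Return (source, target) index pairs for each 1 in the matrix."""
    return [
        (i, j)
        for i, row in enumerate(matrix)
        for j, value in enumerate(row)
        if value == 1
    ]


def is_strictly_upper_triangular(matrix):
    """True if every entry on or below the diagonal is 0.

    With nodes taken in index order, such a matrix cannot contain a cycle.
    """
    return all(matrix[i][j] == 0 for i in range(len(matrix)) for j in range(i + 1))


# Matrices copied from the two random points printed above.
first_matrix = [[0, 1, 1], [0, 0, 1], [0, 0, 0]]
second_matrix = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]

first_edges = edges_from_matrix(first_matrix)
second_edges = edges_from_matrix(second_matrix)
print(first_edges)
print(second_edges)
```

Under this reading, the first point feeds the input to both other nodes, while the second point is a simple chain.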


Now we can use the loss function on the search space:

In [16]:
scores = loss([p1, p2])
print("\n")
print(f"Best solution found:\n  {loss.best_point} \n       = {loss.best_score}")
print(f"Number of evaluations:{loss.calls}")
print(f"All evaluated solutions:{loss.all_solutions}")
print(f"All loss values:{loss.all_scores}")

Global seed set to 5684
Global seed set to 8854




Best solution found:
  [NODES: [['Input'], ['concat', 'MLP', 301, 'swish'], ['mul', 'MLP', 164, 'id']] | MATRIX:[[0, 1, 1], [0, 0, 1], [0, 0, 0]], 'sigmoid', 5684] 
       = 1.280236
Number of evaluations:2
All evaluated solutions:[[NODES: [['Input'], ['concat', 'MLP', 301, 'swish'], ['mul', 'MLP', 164, 'id']] | MATRIX:[[0, 1, 1], [0, 0, 1], [0, 0, 0]], 'sigmoid', 5684], [NODES: [['Input'], ['mul', 'Identity'], ['mul', 'MLP', 417, 'softmax']] | MATRIX:[[0, 1, 0], [0, 0, 1], [0, 0, 0]], 'gelu', 8854]]
All loss values:[1.280236, 1.40858]


### Implementing an optimization strategy

To make it easy to use several metaheuristics, the user can call the function `dragon.search_algorithm.pb_configuration.problem_configuration` directly to define the search strategy.

In our case, we use an Evolutionary Algorithm, so we set the *MetaHeuristic* entry of the config to **GA**.


In [17]:
import time

import numpy as np  # needed for np.round below
from dragon.search_algorithm.pb_configuration import problem_configuration
  
exp_config = {
    "MetaHeuristic": "GA",
    "Generations": 2,
    "PopSize": 4,
    "MutationRate": 0.7,
    "TournamentRate": 10,
    "ElitismRate": 0.1,
    "RandomRate": 0.1,
    "Neighborhood": "Full"
}

_, search_algorithm = problem_configuration(exp_config, sp, loss)

start_time = time.time()
best, score = search_algorithm.run()
end_time = time.time() - start_time
print(f"Best solution found:\nf({best}) = {score},\ncomputation time: {np.round(end_time,2)} seconds")

Global seed set to 7546
Global seed set to 7965
Global seed set to 436
Global seed set to 1612
Global seed set to 8909
Global seed set to 3208
Global seed set to 3039
Global seed set to 8909


Best solution found:
f([[NODES: [['Input'], ['mul', 'Identity'], ['mul', 'Identity']] | MATRIX:[[0, 1, 1], [0, 0, 1], [0, 0, 0]], 'gelu', 7546]]) = [1.217747],
computation time: 152.39 seconds
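
For intuition about what an evolutionary search of this kind does, here is a small standard-library sketch of a genetic algorithm with elitism, tournament selection, and mutation, applied to a toy problem (minimizing a sum of squares over integer vectors). It is a generic illustration, not the algorithm built by `problem_configuration`: all function and parameter names below are hypothetical and do not correspond to the keys of `exp_config`.

```python
import random


def sphere(vector):
    # Toy objective (lower is better); stands in for the DNN loss.
    return sum(x * x for x in vector)


def sample_vector(rng):
    # Random point of the toy search space.
    return [rng.randint(-10, 10) for _ in range(3)]


def mutate_vector(vector, rng):
    # Move one randomly chosen coordinate by +1 or -1.
    child = list(vector)
    index = rng.randrange(len(child))
    child[index] += rng.choice([-1, 1])
    return child


def evaluate(population):
    # Sort (score, point) pairs from best to worst.
    return sorted(((sphere(p), p) for p in population), key=lambda pair: pair[0])


def toy_genetic_search(rng, generations=20, pop_size=8, mutation_rate=0.7,
                       tournament_size=3, n_elites=1):
    scored = evaluate([sample_vector(rng) for _ in range(pop_size)])
    history = [scored[0][0]]  # best score of each generation
    for _ in range(generations):
        # Elitism: the best points survive unchanged.
        next_population = [list(point) for _, point in scored[:n_elites]]
        while len(next_population) < pop_size:
            # Tournament selection: the best of a random subset is the parent.
            contestants = rng.sample(scored, tournament_size)
            _, parent = min(contestants, key=lambda pair: pair[0])
            if rng.random() < mutation_rate:
                child = mutate_vector(parent, rng)
            else:
                child = list(parent)
            next_population.append(child)
        scored = evaluate(next_population)
        history.append(scored[0][0])
    best_score, best_point = scored[0]
    return best_point, best_score, history


toy_best, toy_score, toy_history = toy_genetic_search(random.Random(42))
print(f"best point: {toy_best}, score: {toy_score}")
```

Because at least one elite is carried over at each generation, the best score recorded in `toy_history` never gets worse from one generation to the next.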
