# Maintaining a batch of populations using the functional EvoTorch API

The pure functional algorithm implementations in the `evotorch.algorithms.functional` namespace support batching over starting points, and also over some of the hyperparameters.

In this notebook, we demonstrate how to maintain a batch of populations, each originating from a different starting point, so that different regions of the search space are explored simultaneously.

We begin by importing the necessary libraries:

In [None]:
from evotorch.algorithms.functional import cem, cem_ask, cem_tell, pgpe, pgpe_ask, pgpe_tell
from evotorch.decorators import rowwise

from datetime import datetime
import torch
from math import pi

Now we define the fitness function.

Notice that the fitness function below is decorated with `@rowwise`. This decorator lets us define the function as if its argument `x` were always a vector (i.e. a 1-dimensional tensor). The decorator `@rowwise` ensures the following behaviors:

- if the argument `x` is indeed received as a 1-dimensional tensor, the function works as defined;
- if the argument `x` is received as a matrix (i.e. as a 2-dimensional tensor), the operations of the function are applied to each row of the matrix;
- if the argument `x` is received as a tensor with 3 or more dimensions, the operations of the function are applied to each row of each matrix.

Thanks to this, the fitness function `rastrigin` can be used as is to evaluate a single solution (represented by a 1-dimensional tensor), a single population (represented by a 2-dimensional tensor), or a batch of populations (represented by a tensor with 3 or more dimensions).

In [None]:
@rowwise
def rastrigin(x: torch.Tensor) -> torch.Tensor:
    [n] = x.shape
    A = 10.0
    return A * n + torch.sum((x ** 2) - (A * torch.cos(2 * pi * x)))
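
To make the three cases above concrete, the sketch below reproduces the same shape behavior in plain PyTorch by reducing over the last dimension. The name `rastrigin_lastdim` is hypothetical and is not part of EvoTorch; it only mirrors what the `@rowwise`-decorated function is described to do.

```python
import torch
from math import pi


def rastrigin_lastdim(x: torch.Tensor) -> torch.Tensor:
    # Hypothetical plain-PyTorch stand-in: treat the last dimension as the
    # solution vector and reduce over it, keeping all leading dimensions.
    n = x.shape[-1]
    A = 10.0
    return A * n + torch.sum((x ** 2) - (A * torch.cos(2 * pi * x)), dim=-1)


single_solution = torch.zeros(3)         # one solution of length 3
single_population = torch.zeros(5, 3)    # one population of 5 solutions
population_batch = torch.zeros(4, 5, 3)  # 4 populations of 5 solutions each

shape_of_single = tuple(rastrigin_lastdim(single_solution).shape)
shape_of_population = tuple(rastrigin_lastdim(single_population).shape)
shape_of_batch = tuple(rastrigin_lastdim(population_batch).shape)
print(shape_of_single, shape_of_population, shape_of_batch)
```

In each case the result holds one fitness value per solution: a scalar for a single solution, a vector of 5 values for a population, and a 4-by-5 matrix for the batch of populations.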

In this example, we consider the following batch size:

In [None]:
batch_size = 4
batch_size

Let us generate a batch of starting points, one per population, sampled uniformly from the interval [-5.12, 5.12]:

In [None]:
solution_length = 1000

starting_points = ((torch.rand(batch_size, solution_length) * 2) - 1) * 5.12
starting_points.shape

For both functional `cem` and functional `pgpe`, the hyperparameter `stdev_max_change` can be given in any of the following forms:

- as a scalar, which will then be expanded to a vector;
- as a vector, which will then be used as is;
- as a batch of vectors, in which case, for each batch item `i`, the `i`-th `stdev_max_change` vector will be used.

Since we consider a batch of populations in this example, let us make a batch of `stdev_max_change` vectors, so that each population is maintained with its own `stdev_max_change` value.

In [None]:
smallest_stdev_max_change = 0.01
largest_stdev_max_change = 0.2

stdev_max_change = (
    smallest_stdev_max_change + (
        torch.arange(batch_size) * (
            (largest_stdev_max_change - smallest_stdev_max_change) / (batch_size - 1)
        )
    ).reshape(batch_size, 1) * torch.ones(batch_size, solution_length)
)

print("stdev_max_change:")
print("    ", type(stdev_max_change).__name__, "([", sep="")
for i in range(len(stdev_max_change)):
    print("        [%.4f, ...]," % float(stdev_max_change[i, 0]))
print("    ])")
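
For reference, the accepted forms of `stdev_max_change` differ only in shape. The sketch below builds each form in plain PyTorch and, as an alternative to the explicit arithmetic in the cell above, uses `torch.linspace` to produce evenly spaced per-population values. The variable names `as_scalar`, `as_vector`, and `as_batch` are illustrative only.

```python
import torch

batch_size = 4
solution_length = 1000

# Form 1: a scalar, shared by every dimension of every population.
as_scalar = 0.2

# Form 2: a vector, one value per dimension, shared by every population.
as_vector = torch.full((solution_length,), 0.2)

# Form 3: a batch of vectors, one vector per population.
# `torch.linspace` yields `batch_size` evenly spaced values in [0.01, 0.2].
per_population = torch.linspace(0.01, 0.2, batch_size)
as_batch = per_population.reshape(batch_size, 1).repeat(1, solution_length)

print(as_vector.shape, as_batch.shape)
print(as_batch[:, 0])
```

Each row of `as_batch` holds a single repeated value, so every dimension of a given population shares the same limit while different populations get different limits.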

Now we prepare the first state of our `cem` search:

In [None]:
cem_state = cem(
    # We want to minimize the evaluation results
    objective_sense="min",

    # The batch of vectors `starting_points` is given as our `center_init`,
    # that is, the center point(s) of the initial search distribution(s).
    center_init=starting_points,

    # The standard deviation of the initial search distribution(s).
    stdev_init=10.0,

    # We provide our batch of hyperparameter vectors as `stdev_max_change`.
    stdev_max_change=stdev_max_change,

    # Solutions belonging to the top half (top 50%) of the population(s)
    # will be chosen as parents.
    parenthood_ratio=0.5,
)

cem_state.center.shape

Below is the main loop of the evolutionary search.

In [None]:
# We will run the evolutionary search for this many generations:
num_generations = 1500

# Interval (in seconds) for printing the status:
report_interval = 3
last_report_time = datetime.now()

for generation in range(1, 1 + num_generations):
    # Get a population from the evolutionary algorithm
    population = cem_ask(cem_state, popsize=500)

    # Compute the fitnesses
    fitnesses = rastrigin(population)

    # Inform the evolutionary algorithm of the fitnesses and get its next state
    cem_state = cem_tell(cem_state, population, fitnesses)

    # If it is time to report, print the status
    tnow = datetime.now()
    if ((tnow - last_report_time).total_seconds() > report_interval) or (generation == num_generations):
        print("generation:", generation, "mean fitnesses:", torch.mean(fitnesses, dim=-1))
        last_report_time = tnow
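
To show what the ask and tell steps do conceptually on a batch of populations, here is a toy cross entropy method written in plain PyTorch. It is a simplified sketch, not EvoTorch's implementation: the helper names `toy_ask`, `toy_tell`, and `rastrigin_lastdim` are hypothetical, and the `stdev_max_change` limit is deliberately left out.

```python
import torch
from math import pi


def rastrigin_lastdim(x: torch.Tensor) -> torch.Tensor:
    # Plain-PyTorch fitness: one value per solution (last dimension).
    n = x.shape[-1]
    return 10.0 * n + torch.sum(x ** 2 - 10.0 * torch.cos(2 * pi * x), dim=-1)


def toy_ask(center: torch.Tensor, stdev: torch.Tensor, popsize: int) -> torch.Tensor:
    # center, stdev: (batch, length) -> population: (batch, popsize, length)
    batch, length = center.shape
    noise = torch.randn(batch, popsize, length)
    return center.unsqueeze(1) + stdev.unsqueeze(1) * noise


def toy_tell(population: torch.Tensor, fitnesses: torch.Tensor, parenthood_ratio: float = 0.5):
    # Keep the lowest-fitness solutions of each population (minimization) ...
    num_parents = max(2, int(population.shape[1] * parenthood_ratio))
    best = torch.topk(fitnesses, num_parents, dim=-1, largest=False).indices
    index = best.unsqueeze(-1).expand(-1, -1, population.shape[-1])
    parents = torch.gather(population, 1, index)
    # ... and refit each search distribution to its own parents.
    return parents.mean(dim=1), parents.std(dim=1)


torch.manual_seed(0)
center = (torch.rand(4, 5) * 2 - 1) * 5.12  # 4 populations, 5 dimensions
stdev = torch.full((4, 5), 10.0)

for _ in range(20):
    population = toy_ask(center, stdev, popsize=50)
    fitnesses = rastrigin_lastdim(population)
    center, stdev = toy_tell(population, fitnesses)

print(center.shape, stdev.shape, fitnesses.shape)
```

Every tensor carries the batch dimension first, so the four searches proceed independently without an explicit Python loop over populations.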

Here are the center points found by `cem`:

In [None]:
cem_state.center
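
Since every batch item yields its own center, a natural follow-up is to evaluate the centers and keep the best one. The sketch below does this in plain PyTorch with made-up data: `centers` is a hypothetical placeholder for `cem_state.center`, and `rastrigin_lastdim` is a hypothetical stand-in for the `@rowwise`-decorated fitness function.

```python
import torch
from math import pi


def rastrigin_lastdim(x: torch.Tensor) -> torch.Tensor:
    # Plain-PyTorch fitness: one value per solution (last dimension).
    n = x.shape[-1]
    return 10.0 * n + torch.sum(x ** 2 - 10.0 * torch.cos(2 * pi * x), dim=-1)


# Made-up stand-in for `cem_state.center`: 4 centers of length 3.
centers = torch.tensor([
    [1.0, 1.0, 1.0],
    [0.0, 0.0, 0.0],
    [2.0, -1.0, 0.0],
    [0.5, 0.5, 0.5],
])

center_fitnesses = rastrigin_lastdim(centers)     # one fitness per batch item
best_index = int(torch.argmin(center_fitnesses))  # we are minimizing
best_center = centers[best_index]
print(best_index, best_center)
```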

Let us now consider the functional `pgpe` algorithm.
In the non-batched case, `pgpe` expects its `center_learning_rate` hyperparameter as a scalar.
If it is provided as a vector instead, then for each batch item `i`, the `i`-th value of the `center_learning_rate` vector will be used.

Let us build a `center_learning_rate` vector:

In [None]:
smallest_center_lr = 0.001
largest_center_lr = 0.4

center_learning_rate = smallest_center_lr + (
    torch.arange(batch_size) * (
        (largest_center_lr - smallest_center_lr) / (batch_size - 1)
    )
)

center_learning_rate
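
To illustrate in tensor terms what it means for the `i`-th value to be used for the `i`-th batch item, the sketch below applies a vector of learning rates to a batch of gradient-like vectors through broadcasting. It only demonstrates the shape semantics and is not EvoTorch's update rule; `centers` and `gradients` are made-up stand-ins.

```python
import torch

batch_size = 4
solution_length = 3

learning_rates = torch.linspace(0.001, 0.4, batch_size)  # shape: (4,)
centers = torch.zeros(batch_size, solution_length)       # shape: (4, 3)
gradients = torch.ones(batch_size, solution_length)      # shape: (4, 3)

# Adding a trailing axis turns the learning rates into a (4, 1) column,
# so row i of `gradients` is scaled by learning_rates[i].
updated = centers + learning_rates.unsqueeze(-1) * gradients
print(updated)
```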

Now we prepare the first state of our `pgpe` search:

In [None]:
pgpe_state = pgpe(
    # We want to minimize the evaluation results.
    objective_sense="min",

    # The batch of vectors `starting_points` is given as our `center_init`,
    # that is, the center point(s) of the initial search distribution(s).
    center_init=starting_points,

    # Standard deviation for the initial search distribution(s):
    stdev_init=10.0,

    # We provide our `center_learning_rate` batch here:
    center_learning_rate=center_learning_rate,

    # Learning rate for the standard deviation(s) of the search distribution(s):
    stdev_learning_rate=0.1,

    # We use the "centered" ranking where the worst solution is ranked -0.5,
    # and the best solution is ranked +0.5:
    ranking_method="centered",

    # We use the ClipUp optimizer.
    optimizer="clipup",

    # Just like how we provide a batch of `center_learning_rate` values,
    # we provide a batch of `max_speed` values for ClipUp:
    optimizer_config={"max_speed": center_learning_rate * 2},

    # Maximum relative change allowed for standard deviation(s) of the
    # search distribution(s):
    stdev_max_change=0.2,
)
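
The "centered" ranking mentioned in the comments above can be sketched in plain PyTorch as follows. This is a minimal illustration, assuming that the ranks between the worst and the best solution are evenly spaced; `centered_ranks` is a hypothetical helper, not an EvoTorch function.

```python
import torch


def centered_ranks(fitnesses: torch.Tensor, objective_sense: str = "min") -> torch.Tensor:
    # Hypothetical helper: map each fitness to a rank in [-0.5, +0.5] along
    # the last dimension, with -0.5 for the worst and +0.5 for the best.
    n = fitnesses.shape[-1]
    # Order the solutions so that the worst one comes first.
    worst_first = torch.argsort(fitnesses, dim=-1, descending=(objective_sense == "min"))
    # The argsort of an argsort gives each solution's position in that order.
    positions = torch.argsort(worst_first, dim=-1)
    return positions.to(torch.float32) / (n - 1) - 0.5


# Two populations of three solutions each; lower fitness is better.
fitnesses = torch.tensor([[3.0, 1.0, 2.0], [0.0, 5.0, 10.0]])
ranks = centered_ranks(fitnesses)
print(ranks)
```

Because the ranking is computed along the last dimension, each population in the batch is ranked independently of the others.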

Below is the main loop of the evolutionary search, this time with `pgpe`.

In [None]:
# We will run the evolutionary search for this many generations:
num_generations = 1500

# Interval (in seconds) for printing the status:
report_interval = 3
last_report_time = datetime.now()

for generation in range(1, 1 + num_generations):
    # Get a population from the evolutionary algorithm
    population = pgpe_ask(pgpe_state, popsize=500)

    # Compute the fitnesses
    fitnesses = rastrigin(population)

    # Inform the evolutionary algorithm of the fitnesses and get its next state
    pgpe_state = pgpe_tell(pgpe_state, population, fitnesses)

    # If it is time to report, print the status
    tnow = datetime.now()
    if ((tnow - last_report_time).total_seconds() > report_interval) or (generation == num_generations):
        print("generation:", generation, "mean fitnesses:", torch.mean(fitnesses, dim=-1))
        last_report_time = tnow

Here are the center points found by `pgpe`:

In [None]:
pgpe_state.optimizer_state.center