## `Course evaluation information`

- Concisely note down your observations for each lab from now on.
- You can take notes inside the notebooks, or in separate PDFs.
- Either way, be ready to show the work you put into each lab, including the experiments and the learning outcomes.
- During the exam, you will be asked to show your notes for some of the labs, chosen at random, and to briefly discuss their content.

# Lab. 1: Local search

## Introduction

#### <u>The goal is to study the application of local search algorithms on different benchmark functions.</u>

We will see the following methods:
- *Grid Search*
- *Random Search*
- *Powell*
- *Nelder Mead*

Moreover, we will study how their parameters change the behavior of these algorithms.

---

Getting started: the following code cell contains the core functions that we will use. Hence, **remember to run it every time the runtime is reconnected**.

It contains the four search algorithms and a wrapper class called *OptFun* for the benchmark function.
The constructor of *OptFun* takes a benchmark function as input (we will see later which functions are available). Its four most relevant methods are:
1.   *minima*: returns the minima of the function. For each minimum, the position is available through the *position* attribute and the function value through the *score* attribute.
2.   *bounds*: returns the bounds of the domain on which the function is defined.
3.   *found_minimum*: returns the best point evaluated so far.
4.   *plot*: shows a heatmap of the function with the points visited by the search (2D functions only), alongside the function value at each evaluation.

Each instance of *OptFun* stores the history of the points at which the function has been evaluated. The history is never cleared automatically and is available through *OptFun.history*. Hence, if you reuse a class instance, remember to clear the history first (*OptFun.history = list()*).
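To make the history behaviour concrete, here is a minimal sketch that does not depend on the lab code. `RecordingFunction` and `sphere` are hypothetical stand-ins for *OptFun* and a benchmark function; they only mimic the way every evaluated point is appended to a list.

```python
from copy import deepcopy
from typing import Callable


class RecordingFunction:
    """Hypothetical stand-in for OptFun: wraps a callable and logs every evaluated point."""

    def __init__(self, f: Callable[[list[float]], float]) -> None:
        self.f = f
        self.history: list[list[float]] = []

    def __call__(self, x: list[float]) -> float:
        # Store a copy so later changes to x do not alter the log
        self.history.append(deepcopy(x))
        return self.f(x)


def sphere(x: list[float]) -> float:
    """Simple convex test function (sum of squares)."""
    return sum(v * v for v in x)


func = RecordingFunction(sphere)

# First "run": two evaluations
func([1.0, 2.0])
func([0.5, 0.5])
evaluations_after_first_run = len(func.history)

# Second "run" on the same instance: the old points are still in the log
func([3.0, 3.0])
evaluations_without_reset = len(func.history)

# Clearing the history before reusing the instance keeps the runs separate
func.history = []
func([3.0, 3.0])
evaluations_after_reset = len(func.history)

print(evaluations_after_first_run, evaluations_without_reset, evaluations_after_reset)
```

Creating a fresh wrapper for every run has the same effect as clearing the list, and it is the pattern used in most cells of this notebook.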

---

The available benchmark functions come from the *benchmark_functions* library (imported as *bf*).
Examples of functions that can be used are the *Hypersphere*, the *Rastrigin*, the *DeJong5* and the *Keane*.
The complete list of functions available can be found at this [link](https://gitlab.com/luca.baronti/python_benchmark_functions) or you can print it with *dir(bf)*.

#### Base code to run every time the runtime is reconnected

In [None]:
import inspect
from copy import deepcopy
from typing import Optional

import benchmark_functions as bf
import numpy as np
from matplotlib import pyplot as plt
from numpy.typing import NDArray
from scipy.optimize import OptimizeResult, minimize

plt.rcParams["figure.figsize"] = (22, 8)

In [None]:
class OptFun:
    def __init__(self, wf: bf.BenchmarkFunction) -> None:
        self.f = wf
        self.history: list[list[float]] = []

    def __call__(self, x0: list[float]) -> float:
        self.history.append(deepcopy(x0))
        return self.f(x0)  # type: ignore

    @property
    def name(self) -> str:
        return self.f.name()

    def minima(self) -> list[bf.fil.Optimum]:
        return self.f.minima()

    def bounds(self) -> list[tuple[float, float]]:
        return self._convert_bounds(self.f.suggested_bounds())

    def found_minimum(self) -> list[float]:
        minimum = self.history[0]
        for x in self.history:
            if self.f(x) < self.f(minimum):  # type: ignore
                minimum = x
        return minimum

    def plot(self, fn: Optional[str] = None) -> None:
        plt.clf()
        ax1: plt.Axes
        ax2: plt.Axes
        f, (ax1, ax2) = plt.subplots(1, 2)
        f.suptitle("Benchmark Function: " + self.name)

        # heatmap
        bounds_lower, bounds_upper = self.f.suggested_bounds()
        x = np.linspace(bounds_lower[0], bounds_upper[0], 100)
        if self.f.n_dimensions() > 1:
            y = np.linspace(bounds_lower[1], bounds_upper[1], 100)
            X, Y = np.meshgrid(x, y)
            Z = np.asarray(
                [
                    [self.f((X[i][j], Y[i][j])) for j in range(len(X[i]))]
                    for i in range(len(X))
                ]
            )
        else:
            raise ValueError("Function has only one dimension")
        ax1.contour(x, y, Z, 15, linewidths=0.5, colors="k")
        contour = ax1.contourf(x, y, Z, 15, cmap="viridis", vmin=Z.min(), vmax=Z.max())
        ax1.set_xlabel("x")
        ax1.set_ylabel("y")
        ax1.set_title("Heatmap")
        cbar = plt.colorbar(contour, ax=ax1)
        cbar.set_label("z")
        if len(self.history) > 0:  # plot points
            xdata = [x[0] for x in self.history]
            ydata = [x[1] for x in self.history]
            ax1.plot(xdata, ydata, "or-", markersize=2, linewidth=2)
            # plot function minimum
            minima = self.f.minima()[0]
            ax1.plot(minima.position[0], minima.position[1], "or", markersize=10)

        # convergence
        values = [self.f(v) for v in self.history]
        # Named "optimum" to avoid shadowing the built-in min()
        optimum: float = self.f.minima()[0].score  # type: ignore
        ax2.plot(values)
        ax2.axhline(optimum, color="r", label="optimum")
        ax2.set_xlabel("Iterations")
        ax2.set_ylabel("f(x)")
        ax2.set_title("Function Evaluation")
        ax2.legend()

        if fn is not None:
            plt.savefig(fn, dpi=400)
        plt.show()

    def _convert_bounds(
        self, bounds: tuple[list[float], list[float]]
    ) -> list[tuple[float, float]]:
        new_bounds: list[tuple[float, float]] = []
        for i in range(len(bounds[0])):
            new_bounds.append((bounds[0][i], bounds[1][i]))
        return new_bounds

    def current_calls(self):
        return len(self.history)


def grid_search(
    f: OptFun, step_size: Optional[float] = None, number_of_steps: Optional[int] = None
):
    """
    Optimizes a function by using the grid_search algorithm.

    - f: function to optimize, an instance of OptFun
    - step_size: the step size
    - number_of_steps: the total number of steps
    """
    bounds = f.bounds()
    if step_size is not None:
        for x in np.arange(bounds[0][0], bounds[0][1], step_size):
            for y in np.arange(bounds[1][0], bounds[1][1], step_size):
                f([x, y])
    elif number_of_steps is not None:
        for x in np.linspace(
            bounds[0][0], bounds[0][1], int(np.floor(np.sqrt(number_of_steps)))
        ):
            for y in np.linspace(
                bounds[1][0], bounds[1][1], int(np.floor(np.sqrt(number_of_steps)))
            ):
                f([x, y])
    else:
        print("Please provide at least the step_size or the number of steps")


def random_search(f: OptFun, n_samples_drawn: int):
    """
    Optimizes a function by using the random_search algorithm.

    - f: function to optimize, an instance of OptFun
    - n_samples_drawn: the number of random samples to draw
    """
    bounds = f.bounds()
    for _ in range(n_samples_drawn):
        x = np.random.uniform(bounds[0][0], bounds[0][1])
        y = np.random.uniform(bounds[1][0], bounds[1][1])
        f([x, y])


def powell(
    f: OptFun,
    x0: list[float],
    maxiter: int,
    initial_directions: Optional[NDArray[np.float64]] = None,
) -> OptimizeResult:
    """
    Optimizes a function by using the Powell algorithm.

    - f: function to optimize, an instance of OptFun
    - x0: starting point for the search process
    - maxiter: maximum number of iterations
    - initial_directions: initial set of direction vectors (scipy's "direc" option)
    """
    bounds = f.bounds()
    results: OptimizeResult = minimize(
        fun=f,
        x0=list(x0),
        method="powell",
        bounds=bounds,
        options={
            "ftol": 1e-4,
            "maxfev": None,
            "maxiter": maxiter,
            "direc": initial_directions,
            "return_all": True,
        },
    )
    return results


def nelder_mead(f: OptFun, x0: list[float], maxiter: int) -> OptimizeResult:
    """
    Optimizes a function by using the Nelder-Mead algorithm.

    - f: function to optimize, an instance of OptFun
    - x0: starting point for the search process
    - maxiter: maximum number of iterations
    """
    bounds = f.bounds()
    results = minimize(
        f,
        x0,
        method="Nelder-Mead",
        tol=None,
        bounds=bounds,
        options={
            "maxfev": None,
            "maxiter": maxiter,
            "disp": False,
            "return_all": True,
            "initial_simplex": None,
            "xatol": 0.000,
            "fatol": 0.000,
            "adaptive": False,
        },
    )
    return results

In [None]:
def printClassInitArgs(class_obj: bf.BenchmarkFunction):
    signature = inspect.signature(class_obj.__init__).parameters
    print("-------------------------------")
    for name, parameter in signature.items():
        print("Name: ", name, "\nDefault value:", parameter.default)
        # print("Annotation:", parameter.annotation, "\nKind:", parameter.kind)
        print("-------------------------------")

# Exercises

#### Solve the following exercises, and answer these questions at the end:

- How do the benchmark functions influence the optimization algorithms? Is there an algorithm that is always better than the others?
- Is the choice of the parameters influenced by the function to optimize? And how are the algorithms influenced by the parameters?

In [None]:
# BE AWARE: check the arguments each benchmark function takes and ignore the "opposite" argument
# if you're not sure, you can check the arguments by using the printClassInitArgs function
printClassInitArgs(bf.DeJong5())

printClassInitArgs(bf.Hypersphere())

## Exercise 1/4: GRID SEARCH
In this first exercise we will use grid search as the search algorithm.

### Questions
- How does the step size influence the quality of the best point obtained?
- How does the step size influence the search cost?

In [None]:
benchmarks = [
    bf.Hypersphere(2),
    bf.Rastrigin(2),
    bf.Ackley(2),
]
num_steps = [10, 100, 1000]

for benchmark in benchmarks:
    for steps in num_steps:
        func = OptFun(benchmark)
        grid_search(func, number_of_steps=steps)

        print("Benchmark function: ", func.name)
        print("Num steps: ", steps)
        print("Real minimum: ", func.minima()[0])
        print("Found minimum: ", func.found_minimum())
        func.plot(f"imgs/gs_{func.name}_step_size_{steps}.png")

The more points we sample (finer discretization), the more likely we are to land close to the global minimum. However, the cost of the search grows with the number of points sampled. Since grid search explores the whole domain, it does not follow any basin of attraction: this helps avoid getting trapped in local minima, but it also means that no information about the function itself is exploited.

Moreover, since the space is sampled uniformly, the shape of the function does not affect the search itself (no function is harder to optimize than another); the only additional cost comes from functions that are more expensive to evaluate.
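The trade-off between cost and quality can be checked with a small standalone sketch. It mirrors the way `grid_search` in the base code derives the number of points per axis, `floor(sqrt(number_of_steps))`, but everything else is an assumption made for illustration: a pure-Python `linspace`, a hand-written `sphere` function, and symmetric bounds of [-5, 5] rather than the bounds suggested by the library.

```python
import math


def points_per_axis(number_of_steps: int) -> int:
    """Points per axis when a 2-D grid is sized from a total evaluation budget."""
    return int(math.floor(math.sqrt(number_of_steps)))


def linspace(lo: float, hi: float, n: int) -> list[float]:
    """Pure-Python stand-in for np.linspace (both endpoints included)."""
    if n == 1:
        return [lo]
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]


def sphere(x: float, y: float) -> float:
    """Convex test function with its minimum (value 0) at the origin."""
    return x * x + y * y


def grid_best(number_of_steps: int, lo: float = -5.0, hi: float = 5.0) -> tuple[int, float]:
    """Return (actual number of evaluations, best value found on the grid)."""
    n = points_per_axis(number_of_steps)
    axis = linspace(lo, hi, n)
    best = min(sphere(x, y) for x in axis for y in axis)
    return n * n, best


for budget in (10, 100, 1000):
    evaluations, best = grid_best(budget)
    print(f"budget={budget:5d}  evaluations={evaluations:4d}  best={best:.4f}")
```

Under these assumptions the budgets 10, 100 and 1000 translate into 9, 100 and 961 evaluations. With symmetric bounds, a grid with an odd number of points per axis contains the origin, while one with an even number does not, so a larger budget does not automatically give a better best point on a function whose minimum sits at the centre of the domain.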

## Exercise 2/4: RANDOM SEARCH

In this exercise we will use Random Search to search for the optimum.

### Questions
- How does the number of samples drawn affect the search?
- How does this method compare to Grid Search? What are the advantages and disadvantages?

In [None]:
benchmarks = [
    bf.Hypersphere(2),
    bf.Rastrigin(2),
    bf.Rosenbrock(2),
    bf.Ackley(2),
    bf.DeJong5(),
    bf.Keane(2),
]
n_samples = [10, 100, 500]

for benchmark in benchmarks:
    for n_samples_drawn in n_samples:
        func = OptFun(benchmark)
        random_search(func, n_samples_drawn)

        print("Benchmark function: ", func.name)
        print("Number of samples: ", n_samples_drawn)
        print("Real minimum: ", func.minima()[0])
        print("Found minimum: ", func.found_minimum())
        func.plot(f"imgs/rs_{func.name}_samples_{n_samples_drawn}.png")

Random search is a stochastic method, so sampling more points does not guarantee a better result in any single run. However, the more points we sample, the more likely we are to land close to the global minimum. The cost of the search grows linearly with the number of points sampled.

Similarly to grid search, the shape of the function is not important.
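The "more samples, more likely" statement can be quantified with a short sketch. Assume, purely for illustration, that the points we would accept as "close enough" to the optimum cover a fraction `p_single` of the search domain; with independent uniform samples, the probability that at least one of `n` samples lands there is `1 - (1 - p_single)**n`. The function names and the 1% target region below are hypothetical.

```python
import random


def hit_probability(p_single: float, n_samples: int) -> float:
    """Probability that at least one of n independent uniform samples hits the target region."""
    return 1.0 - (1.0 - p_single) ** n_samples


def empirical_hit_rate(
    p_single: float, n_samples: int, trials: int = 2000, seed: int = 0
) -> float:
    """Monte Carlo estimate of the same probability, as a sanity check."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # A sample "hits" when a uniform draw in [0, 1) falls below p_single
        if any(rng.random() < p_single for _ in range(n_samples)):
            hits += 1
    return hits / trials


for n in (10, 100, 500):
    print(
        f"n={n:4d}  analytic={hit_probability(0.01, n):.3f}  "
        f"simulated={empirical_hit_rate(0.01, n):.3f}"
    )
```

The probability increases with `n` but never reaches 1, which is why a run with more samples can still end up worse than a lucky run with fewer.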


## Exercise 3/4: POWELL OPTIMIZATION

In this exercise we will focus on the Powell optimization algorithm.

### Questions
- What happens when varying the parameters of the algorithm?
- How do they influence the optimization process?
- Are the effects of these parameters the same across different functions?
- How does this algorithm compare to the previous ones?

In [None]:
benchmark = bf.Hypersphere(2)
initial_points = [[4.0, -4.0], [1.0, 1.5], [0.5, -0.5]]
max_iters = [1, 10, 100]

for x_0 in initial_points:
    for max_iter in max_iters:
        func = OptFun(benchmark)
        powell(func, x_0, max_iter)

        print("Benchmark function: ", func.name)
        print("Initial point: ", x_0)
        print("Max iterations: ", max_iter)
        print("Real minimum: ", func.minima()[0])
        print("Found minimum: ", func.found_minimum())
        func.plot()

In [None]:
benchmark = bf.Rastrigin(2)
initial_points = [[4.0, -4.0], [1.0, 1.5], [0.5, -0.5]]
max_iters = [1, 10, 100]

for x_0 in initial_points:
    for max_iter in max_iters:
        func = OptFun(benchmark)
        powell(func, x_0, max_iter)

        print("Benchmark function: ", func.name)
        print("Initial point: ", x_0)
        print("Max iterations: ", max_iter)
        print("Real minimum: ", func.minima()[0])
        print("Found minimum: ", func.found_minimum())
        func.plot()

In [None]:
benchmark = bf.Ackley(2)
axis = [np.eye(2), np.array([[-1, 0], [0, -1]])]
initial_point = [-10.0, -10.0]
iters = 100

for directions in axis:
    # Fresh wrapper per run, so the history of one run does not leak into the next
    func = OptFun(benchmark)
    powell(func, initial_point, iters, directions)
    print("Benchmark function: ", func.name)
    print("Initial point: ", initial_point)
    print("Max iterations: ", iters)
    print("Real minimum: ", func.minima()[0])
    print("Found minimum: ", func.found_minimum())
    func.plot()

In [None]:
benchmark = bf.Rosenbrock(2)
initial_points = [[2.0, -2.0], [1.0, 1.5], [0.25, -0.25]]
max_iters = [1, 10, 100]

for x_0 in initial_points:
    for max_iter in max_iters:
        func = OptFun(benchmark)
        powell(func, x_0, max_iter)

        print("Benchmark function: ", func.name)
        print("Initial point: ", x_0)
        print("Max iterations: ", max_iter)
        print("Real minimum: ", func.minima()[0])
        print("Found minimum: ", func.found_minimum())
        func.plot()

In [None]:
benchmark = bf.Ackley(2)
initial_points = [[10.0, 10.0], [10.0, 5.0], [2.0, 0.5]]
max_iters = [1, 10, 100]

for x_0 in initial_points:
    for max_iter in max_iters:
        func = OptFun(benchmark)
        powell(func, x_0, max_iter)

        print("Benchmark function: ", func.name)
        print("Initial point: ", x_0)
        print("Max iterations: ", max_iter)
        print("Real minimum: ", func.minima()[0])
        print("Found minimum: ", func.found_minimum())
        func.plot()

In [None]:
benchmark = bf.Keane(2)
initial_points = [[10.0, 10.0], [10.0, 5.0], [2.0, 0.5]]
max_iters = [1, 10, 100, 1000]

for x_0 in initial_points:
    for max_iter in max_iters:
        func = OptFun(benchmark)
        powell(func, x_0, max_iter)

        print("Benchmark function: ", func.name)
        print("Initial point: ", x_0)
        print("Max iterations: ", max_iter)
        print("Real minimum: ", func.minima()[0])
        print("Found minimum: ", func.found_minimum())
        func.plot()

In [None]:
benchmark = bf.DeJong5()
initial_points = [[10.0, 10.0], [10.0, 5.0], [2.0, 0.5]]
max_iters = [1, 10, 100]

for x_0 in initial_points:
    for max_iter in max_iters:
        func = OptFun(benchmark)
        powell(func, x_0, max_iter)

        print("Benchmark function: ", func.name)
        print("Initial point: ", x_0)
        print("Max iterations: ", max_iter)
        print("Real minimum: ", func.minima()[0])
        print("Found minimum: ", func.found_minimum())
        func.plot()

## Exercise 4/4: NELDER MEAD OPTIMIZATION

In this exercise we will focus on the Nelder Mead optimization algorithm.
Similar to the previous exercise, answer the following questions:

### Questions
- What happens when varying the parameters of the algorithm?
- How do they influence the optimization process?
- Are the effects of these parameters the same across different functions?
- How does this algorithm compare to the previous ones?

In [None]:
benchmark_function = bf.Hypersphere(2)
initial_points = [[4.0, -4.0], [1.75, 1.0], [0.5, 0.5]]
max_iters = [10, 100, 1000]

for x_0 in initial_points:
    for max_iter in max_iters:
        func = OptFun(benchmark_function)
        nelder_mead(func, x_0, max_iter)

        print("Benchmark function: ", func.name)
        print("Initial guess: ", x_0)
        print("Max iterations: ", max_iter)
        print("Real minimum: ", func.minima()[0])
        print("Found minimum: ", func.found_minimum())
        func.plot()

In [None]:
benchmark_function = bf.Rastrigin(2)
initial_points = [[4.0, -4.0], [1.75, 1.0], [0.25, 0.25]]
max_iters = [10, 100, 1000]

for x_0 in initial_points:
    for max_iter in max_iters:
        func = OptFun(benchmark_function)
        nelder_mead(func, x_0, max_iter)

        print("Benchmark function: ", func.name)
        print("Initial guess: ", x_0)
        print("Max iterations: ", max_iter)
        print("Real minimum: ", func.minima()[0])
        print("Found minimum: ", func.found_minimum())
        func.plot()

In [None]:
benchmark_function = bf.Rosenbrock(2)
initial_points = [[2.0, -2.0], [0.25, -0.25], [1.0, 1.5]]
max_iters = [10, 100, 1000]

for x_0 in initial_points:
    for max_iter in max_iters:
        func = OptFun(benchmark_function)
        nelder_mead(func, x_0, max_iter)

        print("Benchmark function: ", func.name)
        print("Initial guess: ", x_0)
        print("Max iterations: ", max_iter)
        print("Real minimum: ", func.minima()[0])
        print("Found minimum: ", func.found_minimum())
        func.plot()

In [None]:
benchmark_function = bf.Ackley(2)
initial_points = [[20.0, -25.0], [10.0, 5.0], [2.0, 0.5]]
max_iters = [10, 100, 1000]

for x_0 in initial_points:
    for max_iter in max_iters:
        func = OptFun(benchmark_function)
        nelder_mead(func, x_0, max_iter)

        print("Benchmark function: ", func.name)
        print("Initial guess: ", x_0)
        print("Max iterations: ", max_iter)
        print("Real minimum: ", func.minima()[0])
        print("Found minimum: ", func.found_minimum())
        func.plot()

In [None]:
benchmark_function = bf.Keane(2)
initial_points = [[10.0, 10.0], [5.0, 7.0], [1.5, 0.75]]
max_iters = [10, 100, 1000]

for x_0 in initial_points:
    for max_iter in max_iters:
        func = OptFun(benchmark_function)
        nelder_mead(func, x_0, max_iter)

        print("Benchmark function: ", func.name)
        print("Initial guess: ", x_0)
        print("Max iterations: ", max_iter)
        print("Real minimum: ", func.minima()[0])
        print("Found minimum: ", func.found_minimum())
        func.plot()

In [None]:
# tests

benchmark_function = bf.Ackley(2)

x_0 = [25.0, 25.0]
iterations = [1000]
for max_iter in iterations:
    func = OptFun(benchmark_function)
    res = nelder_mead(func, x_0, max_iter)

    func.plot()

    print("Initial guess: ", x_0)
    print("Real minimum: ", func.minima()[0])
    print("Found minimum: ", func.found_minimum())
    print("Iterations: ", max_iter)

x_0 = [20.0, 20.0]
iterations = [1000]
for max_iter in iterations:
    func = OptFun(benchmark_function)
    res = nelder_mead(func, x_0, max_iter)

    func.plot()

    print("Initial guess: ", x_0)
    print("Real minimum: ", func.minima()[0])
    print("Found minimum: ", func.found_minimum())
    print("Iterations: ", max_iter)

The closer the initial guess is to the minimum, the faster the convergence and the better the result obtained.

Note that, since the function is symmetric, starting points that are symmetric with respect to the minimum produce convergence curves of the same shape, albeit at different scales. (The point (1.75, 1.5), which is not symmetric, produces a differently shaped curve.)

## Final questions
- How do the benchmark functions influence the optimization algorithms? Is there an algorithm that is always better than the others?
- Is the choice of the parameters influenced by the function to optimize? And how are the algorithms influenced by the parameters?

In [None]:
# TODO: compare the different optimization algorithms