# Efficient HPO with EvoX

In this chapter, we will explore how to use EvoX for hyperparameter optimization (HPO).

HPO plays a crucial role in many machine learning tasks, but it is often neglected because it is computationally expensive (a single run can take days) and difficult to deploy.

With EvoX, we can simplify HPO deployment using the [HPOProblemWrapper](#HPOProblemWrapper) and achieve efficient computation by leveraging the vmap method and GPU acceleration.

## Transforming Workflow into Problem

```{image} /_static/HPO_structure.svg
:alt: HPO structure
:width: 600px
:align: center
```

The key to deploying HPO with EvoX is to transform the [workflows](#evox.workflows) into [problems](#evox.problems) using the [HPOProblemWrapper](#HPOProblemWrapper). Once transformed, we can treat the [workflows](#evox.workflows) as standard [problems](#evox.problems). The input to the 'HPO problem' consists of the hyperparameters, and the output is the evaluation metrics.
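The idea can be sketched without EvoX at all. The following plain-Python snippet is only an illustration, not the library's implementation: a toy inner optimizer (gradient descent on a one-dimensional quadratic, with the learning rate `lr` as its only hyperparameter) is wrapped in a hypothetical `hpo_problem` function that maps hyperparameters to an evaluation metric.

```python
def inner_workflow(lr: float, iterations: int = 15) -> float:
    """Toy inner optimization: gradient descent on f(x) = x^2 from x = 5."""
    x = 5.0
    for _ in range(iterations):
        x -= lr * 2.0 * x  # the gradient of x^2 is 2x
    return x * x  # final objective value, used as the evaluation metric


def hpo_problem(hyperparams: list[dict]) -> list[float]:
    """Treat the whole workflow as a problem: hyperparameters in, metrics out."""
    return [inner_workflow(lr=h["lr"]) for h in hyperparams]


candidates = [{"lr": 0.01}, {"lr": 0.1}, {"lr": 0.5}]
metrics = hpo_problem(candidates)
# The candidate with the lowest metric is the best hyperparameter setting
best = min(zip(metrics, candidates), key=lambda pair: pair[0])
```

Any optimizer that can minimize `hpo_problem` can now tune `lr`. This sketch evaluates the candidates one after another in a Python loop, whereas EvoX evaluates its instances in parallel.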

## The Key Component -- `HPOProblemWrapper`

To ensure the [HPOProblemWrapper](#HPOProblemWrapper) recognizes the hyperparameters, we need to wrap them using [Parameter](#Parameter). With this straightforward step, the hyperparameters will be automatically identified.

```python
from evox.core import Algorithm, Parameter


class ExampleAlgorithm(Algorithm):
    def __init__(self):  # other constructor arguments omitted
        super().__init__()
        # Wrap the hyperparameters with `Parameter` so they can be identified
        self.omega = Parameter([1.0, 2.0])
        self.beta = Parameter(0.1)

    def step(self):
        # Run one algorithm step using the values of self.omega and self.beta
        pass
```
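To build intuition for how marker-based discovery can work, here is a minimal plain-Python sketch. The `Param` class and the `collect_params` helper are hypothetical stand-ins invented for this illustration; they are not EvoX's actual mechanism.

```python
class Param:
    """Hypothetical marker type standing in for a hyperparameter wrapper."""

    def __init__(self, value):
        self.value = value


class ToyAlgorithm:
    def __init__(self):
        self.omega = Param([1.0, 2.0])  # marked as a hyperparameter
        self.beta = Param(0.1)          # marked as a hyperparameter
        self.pop_size = 10              # plain attribute, not a hyperparameter


def collect_params(obj, prefix="self"):
    """Return {qualified_name: value} for every attribute wrapped in Param."""
    return {
        f"{prefix}.{name}": attr.value
        for name, attr in vars(obj).items()
        if isinstance(attr, Param)
    }


found = collect_params(ToyAlgorithm(), prefix="self.algorithm")
```

Only the wrapped attributes are collected, and `pop_size` is ignored. Because the marker type is what distinguishes them, no separate list of hyperparameter names has to be maintained.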

## Utilizing the `HPOFitnessMonitor`

We provide an [HPOFitnessMonitor](#HPOFitnessMonitor) that supports calculating 'IGD' and 'HV' metrics for multi-objective problems, as well as the minimum value for single-objective problems.

It is important to note that the [HPOFitnessMonitor](#HPOFitnessMonitor) is a basic monitor designed for HPO problems. You can also create your own customized monitor flexibly using the approach outlined in [Deploy HPO with Custom Algorithms](#/guide/developer/custom_hpo_prob).
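For readers unfamiliar with the two multi-objective metrics, the sketch below computes them in plain Python under simplifying assumptions: all objectives are minimized, and the hypervolume routine handles only two objectives. The function names are illustrative, and the monitor's own implementation may differ.

```python
import math


def igd(reference_front, solutions):
    """Inverted generational distance: mean distance from each reference
    point to its nearest obtained solution (lower is better)."""
    total = 0.0
    for ref in reference_front:
        total += min(math.dist(ref, sol) for sol in solutions)
    return total / len(reference_front)


def hypervolume_2d(solutions, ref_point):
    """Hypervolume for two minimized objectives: the area dominated by the
    solutions and bounded by ref_point (higher is better)."""
    # Keep only points that are strictly better than the reference point
    pts = sorted(p for p in solutions if p[0] < ref_point[0] and p[1] < ref_point[1])
    area, best_f2 = 0.0, ref_point[1]
    for f1, f2 in pts:  # sweep in order of increasing first objective
        if f2 < best_f2:  # the point is non-dominated so far
            area += (ref_point[0] - f1) * (best_f2 - f2)
            best_f2 = f2
    return area


front = [(0.0, 1.0), (0.5, 0.5), (1.0, 0.0)]  # known reference front
found = [(0.0, 1.0), (1.0, 0.0)]              # solutions obtained by a run
igd_value = igd(front, found)
hv_value = hypervolume_2d(found, ref_point=(2.0, 2.0))
```

IGD needs a reference front and rewards solutions that cover it closely. HV needs only a reference point and rewards solutions that dominate a large region.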

## A Simple Example

Here, we'll demonstrate a simple example of using EvoX for HPO. Specifically, we will use an outer [PSO](#PSO) algorithm to optimize the hyperparameters of an inner [PSO](#PSO) algorithm that solves the sphere problem.

Please note that this chapter provides only a brief overview of HPO deployment. For a more detailed guide, refer to [Deploy HPO with Custom Algorithms](#/guide/developer/custom_hpo_prob).

To start, let's import the necessary modules.

```python
import torch

from evox.algorithms.pso_variants.pso import PSO
from evox.core import Problem, jit_class
from evox.problems.hpo_wrapper import HPOFitnessMonitor, HPOProblemWrapper
from evox.workflows import EvalMonitor, StdWorkflow
```

Next, we define a simple sphere problem.

```python
@jit_class
class Sphere(Problem):
    def __init__(self):
        super().__init__()

    def evaluate(self, x: torch.Tensor):
        # Sum of squares over the last dimension: one fitness value per row
        return (x * x).sum(-1)
```

Next, we use the [StdWorkflow](#StdWorkflow) to wrap the [problem](#evox.problems), [algorithm](#evox.algorithms) and [monitor](#Monitor). Then we use the [HPOProblemWrapper](#HPOProblemWrapper) to transform the [StdWorkflow](#StdWorkflow) into an HPO problem.

```python
torch.set_default_device("cuda" if torch.cuda.is_available() else "cpu")

# Inner workflow: PSO with a population of 10 on the 10-dimensional sphere problem
inner_algo = PSO(10, -10 * torch.ones(10), 10 * torch.ones(10))
inner_prob = Sphere()
inner_monitor = HPOFitnessMonitor()
inner_monitor.setup()
inner_workflow = StdWorkflow()
inner_workflow.setup(inner_algo, inner_prob, monitor=inner_monitor)

# Transform the inner workflow to an HPO problem
hpo_prob = HPOProblemWrapper(
    iterations=15, num_instances=5, workflow=inner_workflow, copy_init_state=True
)
```

The [HPOProblemWrapper](#HPOProblemWrapper) takes four arguments:

1. `iterations`: The number of iterations to execute in the optimization process.
2. `num_instances`: The number of instances to execute in parallel in the optimization process.
3. `workflow`: The workflow to use in the optimization process. It must be wrapped by [jit_class](#jit_class).
4. `copy_init_state`: Whether to copy the initial state of the workflow for each evaluation. Defaults to `True`. If your workflow contains operations that modify tensors of the initial state in place, set this to `True`. Otherwise, you can set it to `False` to save memory.
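Why copying matters can be shown with a small plain-Python sketch. Here a dictionary stands in for the workflow state, and `evaluate` is a hypothetical stand-in for the wrapper's evaluation logic, not its actual code.

```python
import copy


def run_workflow(state: dict, step_size: float, iterations: int = 3) -> float:
    """Toy workflow that updates its state in place and returns a metric."""
    for _ in range(iterations):
        state["x"][0] -= step_size  # in-place mutation of the state's list
    return abs(state["x"][0])


def evaluate(init_state: dict, step_size: float, copy_init_state: bool) -> float:
    # Hypothetical wrapper logic: optionally protect the initial state
    state = copy.deepcopy(init_state) if copy_init_state else init_state
    return run_workflow(state, step_size)


shared = {"x": [3.0]}
first = evaluate(shared, 1.0, copy_init_state=False)   # mutates `shared`
second = evaluate(shared, 1.0, copy_init_state=False)  # starts from the mutated state

safe = {"x": [3.0]}
first_safe = evaluate(safe, 1.0, copy_init_state=True)
second_safe = evaluate(safe, 1.0, copy_init_state=True)
```

Without the copy, the second evaluation starts from wherever the first one ended, so identical hyperparameters yield different metrics. With the copy, every evaluation starts from the same initial state, at the cost of extra memory.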

We can verify whether the [HPOProblemWrapper](#HPOProblemWrapper) correctly recognizes the hyperparameters we define. Since no modifications are made to the hyperparameters across the 5 instances, they should remain identical for all instances.

```python
params = hpo_prob.get_init_params()
print("init params:\n", params)
```

```text
init params:
 {'self.algorithm.w': Parameter containing:
tensor([0.6000, 0.6000, 0.6000, 0.6000, 0.6000], device='cuda:0'), 'self.algorithm.phi_p': Parameter containing:
tensor([2.5000, 2.5000, 2.5000, 2.5000, 2.5000], device='cuda:0'), 'self.algorithm.phi_g': Parameter containing:
tensor([0.8000, 0.8000, 0.8000, 0.8000, 0.8000], device='cuda:0')}
```


We can also supply a custom set of hyperparameter values. The number of hyperparameter sets must match the number of instances in the [HPOProblemWrapper](#HPOProblemWrapper). Additionally, custom hyperparameters must be provided as a dictionary and wrapped using [Parameter](#Parameter).

```python
params = hpo_prob.get_init_params()
# Since we have 5 instances, we need to pass 5 sets of hyperparameters
params["self.algorithm.w"] = torch.nn.Parameter(torch.rand(5, 1), requires_grad=False)
params["self.algorithm.phi_p"] = torch.nn.Parameter(torch.rand(5, 1), requires_grad=False)
params["self.algorithm.phi_g"] = torch.nn.Parameter(torch.rand(5, 1), requires_grad=False)
result = hpo_prob.evaluate(params)
print(params)
print("result:\n", result)
```

```text
{'self.algorithm.w': Parameter containing:
tensor([[0.3606],
        [0.8700],
        [0.8371],
        [0.9362],
        [0.6158]], device='cuda:0'), 'self.algorithm.phi_p': Parameter containing:
tensor([[0.1089],
        [0.8496],
        [0.2948],
        [0.4693],
        [0.5615]], device='cuda:0'), 'self.algorithm.phi_g': Parameter containing:
tensor([[0.6890],
        [0.1214],
        [0.7578],
        [0.8836],
        [0.6416]], device='cuda:0')}
result:
 tensor([104.5386, 125.3063,  45.4021, 159.3124,  27.9212], device='cuda:0')
```
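Because a mismatch between the number of hyperparameter sets and the number of instances is easy to introduce, a small pre-check can help. The helper below is a hypothetical convenience written for this guide, not part of EvoX. It relies only on `len()`, which returns the size of the first dimension for a `torch.Tensor` as well as for a list.

```python
def mismatched_entries(params: dict, num_instances: int) -> list[str]:
    """Return the names of entries whose leading dimension is not num_instances."""
    return [name for name, values in params.items() if len(values) != num_instances]


# Plain lists stand in for tensors here; each entry holds one value per instance
candidate_params = {
    "self.algorithm.w": [0.4, 0.5, 0.6, 0.7, 0.8],
    "self.algorithm.phi_p": [1.0, 1.5, 2.0, 2.5, 3.0],
    "self.algorithm.phi_g": [1.0, 1.5, 2.0],  # deliberately too short
}
bad = mismatched_entries(candidate_params, num_instances=5)
```

An empty result means every entry has one value per instance. Here `bad` names the entry that needs fixing before the dictionary is evaluated.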


Now, we use an outer [PSO](#PSO) algorithm to optimize the hyperparameters of the inner [PSO](#PSO) algorithm.

It is important to ensure that the population size of the [PSO](#PSO) matches the number of instances; otherwise, unexpected errors may occur.

Additionally, the solution needs to be transformed in the outer workflow, as the [HPOProblemWrapper](#HPOProblemWrapper) requires the input to be in the form of a dictionary.

```python
class solution_transform(torch.nn.Module):
    def forward(self, x: torch.Tensor):
        # Map each column of the population to the hyperparameter it encodes
        return {
            "self.algorithm.w": x[:, 0],
            "self.algorithm.phi_p": x[:, 1],
            "self.algorithm.phi_g": x[:, 2],
        }


# The outer population size (5) matches num_instances of the HPO problem
outer_algo = PSO(5, 0 * torch.ones(3), 3 * torch.ones(3))
monitor = EvalMonitor(full_sol_history=False)
outer_workflow = StdWorkflow()
outer_workflow.setup(outer_algo, hpo_prob, monitor=monitor, solution_transform=solution_transform())
outer_workflow.init_step()
for _ in range(20):
    outer_workflow.step()
monitor = outer_workflow.get_submodule("monitor")
print("params:\n", monitor.topk_solutions, "\n")
print("result:\n", monitor.topk_fitness)
```

```text
params:
 tensor([[0.3746, 2.1251, 1.0888]], device='cuda:0')

result:
 tensor([1.9542], device='cuda:0')
```
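The column-to-dictionary mapping performed by `solution_transform` can be sketched in plain Python, with nested lists standing in for the population tensor. The `to_param_dict` helper is illustrative only and is not part of EvoX.

```python
def to_param_dict(population: list[list[float]]) -> dict:
    """Split each row (one candidate) into named hyperparameter columns."""
    names = ["self.algorithm.w", "self.algorithm.phi_p", "self.algorithm.phi_g"]
    return {name: [row[i] for row in population] for i, name in enumerate(names)}


population = [
    [0.6, 2.5, 0.8],  # candidate 0: w, phi_p, phi_g
    [0.4, 2.0, 1.0],  # candidate 1
]
param_dict = to_param_dict(population)
```

Each dictionary entry holds one value per candidate. This is why the outer population size has to equal the number of instances in the HPO problem.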
