# Evaluators

In [None]:
from overrides import overrides
from typing import List, Optional
from archai.discrete_search.api import ArchaiModel

We will use the `SegmentationDagSearchSpace` search space for this example.

In [None]:
from archai.discrete_search.search_spaces.cv import SegmentationDagSearchSpace

In [None]:
ss = SegmentationDagSearchSpace(nb_classes=1, img_size=(64, 64), max_layers=5, seed=11)

In [None]:
m = ss.random_sample()
m.arch.view()

`SegmentationDagSearchSpace` is a subclass of `EvolutionarySearchSpace`, so its `mutate` and `crossover` methods are already implemented.

In [None]:
ss.mutate(m).arch.view()

## Evaluating models

`Evaluators` are the main tool for evaluating architectures against given criteria (task performance, speed, size, etc.). Archai supports two types of `Evaluators`:

* ModelEvaluator ([archai.discrete_search.api.evaluator.ModelEvaluator](https://microsoft.github.io/archai/reference/api/archai.discrete_search.api.html#module-archai.discrete_search.api.evaluator.ModelEvaluator))
    * Evaluates a model: `ModelEvaluator.evaluate(model, budget)`

* AsyncModelEvaluator ([archai.discrete_search.api.evaluator.AsyncModelEvaluator](https://microsoft.github.io/archai/reference/api/archai.discrete_search.api.html#module-archai.discrete_search.api.evaluator.AsyncModelEvaluator)):
    * Sends an evaluation job: `AsyncModelEvaluator.send(model, budget)`
    * Fetches all evaluation jobs from the queue: `AsyncModelEvaluator.fetch_all()`
  
The search algorithm runs a synchronous evaluator (`ModelEvaluator`) sequentially, one model at a time. An asynchronous evaluator (`AsyncModelEvaluator`) instead sends evaluation jobs to a queue and fetches the results later, so it can be used to evaluate models remotely or in a distributed fashion.
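As a rough illustration of the two calling patterns, here is a minimal sketch that uses only the standard library. `ToyLengthEvaluator` and `ToyAsyncEvaluator` are hypothetical stand-ins invented for this example, and plain lists play the role of models. They are not Archai classes and say nothing about how Archai implements its evaluators.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Optional


class ToyLengthEvaluator:
    """Hypothetical synchronous evaluator: the caller blocks until the score is ready."""

    def evaluate(self, model, budget: Optional[float] = None) -> float:
        # Toy "metric": the length of the list standing in for a model
        return float(len(model))


class ToyAsyncEvaluator:
    """Hypothetical asynchronous evaluator: jobs are queued now and collected later."""

    def __init__(self, evaluator: ToyLengthEvaluator, max_workers: int = 2):
        self.evaluator = evaluator
        self.pool = ThreadPoolExecutor(max_workers=max_workers)
        self.futures = []

    def send(self, model, budget: Optional[float] = None) -> None:
        # Returns immediately; the job runs in a worker thread
        self.futures.append(self.pool.submit(self.evaluator.evaluate, model, budget))

    def fetch_all(self) -> List[float]:
        # Waits for every queued job, then clears the queue
        results = [future.result() for future in self.futures]
        self.futures = []
        return results


models = [[0] * n for n in (3, 5, 8)]

# Synchronous pattern: one blocking call per model
sync_eval = ToyLengthEvaluator()
sync_results = [sync_eval.evaluate(m) for m in models]

# Asynchronous pattern: dispatch everything first, collect afterwards
async_eval = ToyAsyncEvaluator(ToyLengthEvaluator())
for m in models:
    async_eval.send(m)
async_results = async_eval.fetch_all()
```

In this toy version, both patterns yield the same scores in the same order; the difference is only in when the caller waits for them.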

The `ArchaiModel` object passed to an evaluator can be used to access the architecture, if necessary. Some objectives actively use a dataset (e.g. task accuracy), while others (e.g. FLOPs, latency, memory) may not.

The `budget` argument, if provided, is a multiplier value used by search algorithms like `SuccessiveHalving` to specify how much compute should be spent on the evaluation.
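One way an evaluator might interpret this multiplier is to scale its training length. The sketch below is an assumption made for illustration: `BASE_EPOCHS`, `epochs_for_budget`, and `toy_evaluate` are hypothetical names, and how a real evaluator or search algorithm uses `budget` depends on its implementation.

```python
from typing import Optional

BASE_EPOCHS = 4  # Hypothetical training length at full budget (budget=1.0)


def epochs_for_budget(budget: Optional[float] = None) -> int:
    """Scales a base epoch count by the budget multiplier (assumed interpretation)."""
    multiplier = 1.0 if budget is None else budget
    # Always train for at least one epoch
    return max(1, int(BASE_EPOCHS * multiplier))


def toy_evaluate(model, budget: Optional[float] = None) -> int:
    # A real evaluator would train `model` here; this toy only reports the planned epoch count
    return epochs_for_budget(budget)


# Planned epochs for a few budget values
schedule = {b: toy_evaluate(model=None, budget=b) for b in (0.25, 0.5, 1.0, 2.0)}
```

Under this interpretation, a search algorithm could screen many candidates cheaply with small budgets and reserve larger budgets for the most promising ones.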

Read more about Evaluators [here](https://microsoft.github.io/archai/reference/api/archai.discrete_search.api.html#module-archai.discrete_search.api.evaluator). You can find a list of built-in evaluators in `archai.discrete_search.evaluators`.

### Example: Using a built-in evaluator (`AvgOnnxLatency`)

Let's use a built-in evaluator to measure the ONNX latency of PyTorch models.

In [None]:
from archai.discrete_search.evaluators import AvgOnnxLatency

In [None]:
onnx_latency_obj = AvgOnnxLatency(input_shape=(1, 3, 64, 64))
onnx_latency_obj.evaluate(model=ss.random_sample(), budget=None)

### Custom Evaluator Example

Let's create a simple custom `ModelEvaluator` that counts the number of modules in a model.

In [None]:
from archai.api.dataset_provider import DatasetProvider
from archai.discrete_search.api import ModelEvaluator

In [None]:
class NumberOfModules(ModelEvaluator):
    ''' Class that measures the size of a model by the number of torch modules '''
    
    @overrides
    def evaluate(self, model: ArchaiModel,
                 budget: Optional[float] = None):
        return len(list(model.arch.modules()))

In [None]:
m = ss.random_sample()

In [None]:
my_objective = NumberOfModules()
my_objective.evaluate(m)

### Useful Evaluators

* [RayParallelEvaluator](https://microsoft.github.io/archai/reference/api/archai.discrete_search.objectives.html#module-archai.discrete_search.evaluators.ray) - Wraps an existing `ModelEvaluator` into a new `AsyncModelEvaluator` that runs evaluation jobs using multiple Ray workers.

* [EvaluationFunction](https://microsoft.github.io/archai/reference/api/archai.discrete_search.objectives.html#module-archai.discrete_search.evaluators.functional) - Wraps a function that takes `(model, budget)` arguments and creates a `ModelEvaluator`.
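The function-wrapping idea can be sketched as a small adapter. `ToyEvaluationFunction` below is a hypothetical illustration of the pattern, not Archai's `EvaluationFunction`, and the dictionary used as a model is likewise a placeholder.

```python
from typing import Callable, Optional


class ToyEvaluationFunction:
    """Hypothetical adapter that exposes a plain function through an `evaluate` method."""

    def __init__(self, evaluation_fn: Callable[[object, Optional[float]], float]):
        self.evaluation_fn = evaluation_fn

    def evaluate(self, model, budget: Optional[float] = None) -> float:
        # Delegates to the wrapped function with the same (model, budget) arguments
        return self.evaluation_fn(model, budget)


def n_layers(model, budget: Optional[float] = None) -> float:
    # Toy metric: number of entries in the placeholder model's layer list
    return float(len(model["layers"]))


toy_evaluator = ToyEvaluationFunction(n_layers)
score = toy_evaluator.evaluate({"layers": ["conv", "relu", "conv"]})
```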

#### Example: Parallelizing NumberOfModules

Let's use `RayParallelEvaluator` to run our custom evaluator `NumberOfModules` on several models in parallel.

In [None]:
from archai.discrete_search.evaluators import RayParallelEvaluator

In [None]:
my_objective_parallel = RayParallelEvaluator(
    NumberOfModules(), 
    timeout=10, # Timeout in seconds
    num_cpus=1.0 # Each evaluation job will use a CPU core
)

`my_objective_parallel` is now an `AsyncModelEvaluator` object. We can send evaluation jobs by calling `AsyncModelEvaluator.send(model, budget)`:

In [None]:
model_list = [ss.random_sample() for _ in range(10)]

for model in model_list:
    print(f'Dispatching job for {model.archid}')
    my_objective_parallel.send(model, budget=None)

We can fetch and clear all jobs from the job queue by calling `AsyncModelEvaluator.fetch_all()`:

In [None]:
my_objective_parallel.fetch_all()

After that, the job queue should be empty:

In [None]:
assert my_objective_parallel.fetch_all() == []

#### Example: Wrapping custom training code into an Evaluator

Let's consider the problem of measuring the task performance on a specific dataset with custom training code. 

In [None]:
from archai.datasets.cv.mnist_dataset_provider import MnistDatasetProvider
from archai.discrete_search.evaluators import EvaluationFunction

Datasets in Archai are defined using dataset providers. We will use the built-in `MnistDatasetProvider` dataset provider for the MNIST dataset.

In [None]:
dataset_provider = MnistDatasetProvider()

We can now wrap custom training code easily using the `EvaluationFunction` wrapper:

```python
import torch

def custom_training_val_performance(model, budget=None):
    # `dataset_provider` is the MnistDatasetProvider created above
    tr_data = dataset_provider.get_train_dataset()
    val_data = dataset_provider.get_val_dataset()

    tr_dl = torch.utils.data.DataLoader(tr_data, shuffle=True, batch_size=16)
    # Shuffling is unnecessary for validation
    val_dl = torch.utils.data.DataLoader(val_data, shuffle=False, batch_size=16)

    optimizer = torch.optim.Adam(model.arch.parameters(), lr=1e-3)
    ...

    # Training loop (elided)
    for batch in tr_dl:
        ...

    # Validation loop (elided); should compute `validation_metric`
    for batch in val_dl:
        ...

    return validation_metric
```

```python

# Wraps custom training function into a ModelEvaluator
custom_evaluator = EvaluationFunction(custom_training_val_performance)

# Evaluates an architecture from the search space
custom_evaluator.evaluate(ss.random_sample(), budget=None)
```

See the next notebook for a complete example using custom training objectives.