## Latency calculation with ENOT

This notebook describes how to calculate latency using the ENOT framework.

### Main chapters of this notebook:
1. Initialize latency of search space (`SearchSpaceModel`)
1. Calculate latency of arbitrary model/module

## Initialize latency of search space (`SearchSpaceModel`)

To initialize the latency of a `SearchSpaceModel`, import `SearchSpaceModel` from `enot.models` and the `initialize_latency` function from `enot.latency`:

In [None]:
from enot.models import SearchSpaceModel
from enot.latency import initialize_latency

`initialize_latency` has the following signature:

```python
def initialize_latency(
    latency_type: str,
    search_space: SearchSpaceModel,
    inputs: Tuple[torch.Tensor, ...],
    **keyword_inputs
) -> Tuple[float, float, float, float]:
```

`latency_type (str)`: type of the latency to be initialized in `search_space`.
Currently, ENOT supports only the multiply-accumulate (MAC) latency type.
To initialize MAC latency, pass `latency_type='mmac'`.
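To make the MAC latency type concrete, here is a minimal hand-computed sketch. It does not use ENOT; the helper names `conv2d_macs` and `linear_macs` are hypothetical, the layer shapes are illustrative, and the reading of `'mmac'` as "millions of MACs" is an assumption rather than something stated above.

```python
def conv2d_macs(in_channels, out_channels, kernel_size, out_height, out_width, groups=1):
    """Multiply-accumulate count of one 2D convolution (bias ignored)."""
    kernel_h, kernel_w = kernel_size
    # Each output element needs one MAC per input channel in its group
    # and per kernel position.
    macs_per_output = (in_channels // groups) * kernel_h * kernel_w
    return macs_per_output * out_channels * out_height * out_width


def linear_macs(in_features, out_features):
    """Multiply-accumulate count of one fully connected layer (bias ignored)."""
    return in_features * out_features


# Illustrative layers: a 3x3 stem convolution (3 -> 32 channels) producing
# a 112x112 feature map, and a 1280 -> 10 classifier.
stem_macs = conv2d_macs(3, 32, (3, 3), 112, 112)
head_macs = linear_macs(1280, 10)

# Assumption: 'mmac' is interpreted here as millions of MACs.
total_mmac = (stem_macs + head_macs) / 1e6
print(f'stem: {stem_macs} MACs, head: {head_macs} MACs, total: {total_mmac} MMACs')
```

Because a MAC count depends only on layer shapes, it is a hardware-independent proxy for latency.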

ENOT has a built-in MAC calculator that covers most modules. For unsupported modules, a third-party calculator can be used:

- to use the **PyTorch-OpCounter (thop)** third-party MAC calculator, pass `latency_type='mmac.thop'`
- to use the **PyTorch-estimate-flops (pthflops)** third-party MAC calculator, pass `latency_type='mmac.pthflops'`

Note: third-party calculators complement the built-in calculator. If the built-in calculator knows how to calculate the latency of a module, the third-party calculator is not used for that module.
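The "complement, not replace" rule from the note can be sketched as a simple fallback dispatch. This is only an illustration of the idea, not ENOT's implementation: `make_calculator`, the dictionary-based layer description, and both calculators are hypothetical.

```python
from typing import Callable, Dict, Optional

MacFn = Callable[[dict], float]


def make_calculator(builtin: Dict[str, MacFn], third_party: Optional[MacFn] = None) -> MacFn:
    """Build a calculator that prefers built-in handlers over a third-party one."""

    def calculate(layer: dict) -> float:
        handler = builtin.get(layer['kind'])
        if handler is not None:
            return handler(layer)  # the built-in handler always wins
        if third_party is not None:
            return third_party(layer)  # fallback for unknown layer kinds only
        raise ValueError(f"no MAC calculator for layer kind {layer['kind']!r}")

    return calculate


def builtin_linear(layer: dict) -> float:
    return float(layer['in_features'] * layer['out_features'])


def third_party_stub(layer: dict) -> float:
    # Deliberately returns a recognizable value to show which path was taken.
    return 999.0


calc = make_calculator({'linear': builtin_linear}, third_party_stub)
known = calc({'kind': 'linear', 'in_features': 4, 'out_features': 2})
unknown = calc({'kind': 'custom'})
print(known, unknown)
```

Here the `linear` layer is handled by the built-in function even though a third-party calculator is available, while the `custom` layer falls through to the third-party stub.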

`search_space`: the `SearchSpaceModel` whose latency is calculated.

`inputs: Tuple[torch.Tensor, ...]`: positional inputs of `search_space`.

Keyword inputs of `search_space` can also be passed through `**keyword_inputs`.


`initialize_latency` returns four values in the following order:
- latency of constant part of `search_space`
- sum of minimum latencies over all containers + constant part: $\sum\limits_{c \in C} \min\limits_{i \in P_c} \text{latency}(i) + K$
- sum of mean latencies over all containers + constant part: $\sum\limits_{c \in C} \frac{1}{|P_c|}\sum\limits_{i \in P_c} \text{latency}(i) + K$
- sum of maximum latencies over all containers + constant part: $\sum\limits_{c \in C} \max\limits_{i \in P_c} \text{latency}(i) + K$

where $C$ is the set of all `SearchableOperationsContainer` instances in `search_space`, $P_c$ is the set of operations in container $c$, $|P_c|$ is the number of operations in container $c$, and $K$ is the latency of the constant part of `search_space`.
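The three sums above can be checked on a toy example. The sketch below is plain Python and does not call ENOT; `latency_statistics` is a hypothetical helper, and the per-operation latencies are made-up numbers.

```python
from statistics import mean


def latency_statistics(constant_latency, containers):
    """Return (K, min total, mean total, max total) for per-container latencies.

    `containers` is a list with one entry per container c; each entry lists
    the latencies of the operations in P_c.
    """
    min_total = constant_latency + sum(min(ops) for ops in containers)
    mean_total = constant_latency + sum(mean(ops) for ops in containers)
    max_total = constant_latency + sum(max(ops) for ops in containers)
    return constant_latency, min_total, mean_total, max_total


# Two containers: the first holds three candidate operations, the second two.
containers = [[10.0, 20.0, 30.0], [5.0, 15.0]]
stats = latency_statistics(2.0, containers)
print(stats)  # (2.0, 17.0, 32.0, 47.0)
```

The minimum and maximum totals bound the latency of any architecture that can be sampled from the search space, since each container contributes exactly one of its operations.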


The *return values* can be ignored if these statistics are not needed.

For example, to calculate the MAC latency of the search space from `Tutorial - getting started`:

In [None]:
import torch
from enot.models.mobilenet import build_mobilenet

In [None]:
model = build_mobilenet(
    search_ops=['MIB_k=3_t=6', 'MIB_k=5_t=6', 'MIB_k=7_t=6'],
    num_classes=10,
    blocks_out_channels=[24, 32, 64, 96, 160, 320],
    blocks_count=[2, 2, 2, 1, 2, 1],
    blocks_stride=[2, 2, 2, 1, 2, 1],
)
search_space = SearchSpaceModel(model).cpu()
inputs = torch.ones(1, 3, 224, 224)

Now the MAC latency of `search_space` can be initialized:

In [None]:
initialize_latency('mmac', search_space, (inputs, ));  # The trailing semicolon suppresses the output of statistics.

Alternatively, we can enable the **PyTorch-OpCounter** third-party calculator and print the statistics:

In [None]:
from enot.latency import min_latency
from enot.latency import mean_latency
from enot.latency import max_latency
from enot.latency import median_latency
from enot.latency import current_latency
from enot.latency import plot_latencies

In [None]:
container = initialize_latency('mmac.thop', search_space, (inputs, ))
print(f'Constant latency = {container.constant_latency}\n'
      f'Min latency: {min_latency(container)}\n'
      f'Mean latency: {mean_latency(container)}\n'
      f'Max latency: {max_latency(container)}\n'
      f'Median latency: {median_latency(container)}\n')

In [None]:
plot_latencies(container, (8, 8));

To get the latency of `search_space`:

In [None]:
latency = current_latency(search_space)
print(f'Latency = {latency}')

## Calculate latency of arbitrary model/module

To calculate the latency of an arbitrary model/module, import `MacCalculatorThop` or `MacCalculatorPthflops` from `enot.latency`:

In [None]:
from enot.latency import MacCalculatorThop
from enot.latency import MacCalculatorPthflops

Each latency calculator has a single method with the following signature:

```python
def calculate(
    model: nn.Module,
    inputs: Tuple[torch.Tensor, ...],
    ignore_modules: Optional[list] = None,
    **options
) -> float:
```

You pass the model, its inputs, and optionally a list of modules to ignore in the calculation, plus any additional options.

For example (model and inputs from previous example are used):

In [None]:
MacCalculatorThop().calculate(model, (inputs, ))  # inputs are passed as a tuple, as in the signature above

In [None]:
MacCalculatorPthflops().calculate(model, (inputs, ))  # inputs are passed as a tuple, as in the signature above