# Inference advisor

The ML Inference Advisor (MLIA) is a tool that helps AI developers design and optimize neural network models for
efficient inference on Arm targets. It enables performance analysis and provides
actionable advice early in the model development cycle.

MLIA can be used as a standalone application or as a Python library with an established API.
The main goal of this notebook is to show the general use cases of the API. For running MLIA as an application, please refer to the documentation.

## Initial setup

Before launching MLIA, logging should be configured. The API uses the standard Python `logging` module.
The library comes with a helper function, `setup_logging`, which takes the following parameters:

- `logs_dir` - directory that will be used for storing log files
- `verbose` - enable verbose output

The following code snippet enables logging with the log directory `mlia_logs` and non-verbose output:

In [None]:
from mlia.cli.logging import setup_logging

logs_dir = "mlia_logs"
verbose = False

setup_logging(logs_dir, verbose)
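To illustrate what a helper with these two parameters typically does, here is a minimal sketch built only on the standard `logging` module. It is not MLIA's implementation: the function name `configure_logging_sketch`, the logger name `mlia_sketch`, and the file name `sketch.log` are all assumptions made for this example.

```python
import logging
import tempfile
from pathlib import Path


def configure_logging_sketch(logs_dir: str, verbose: bool = False) -> logging.Logger:
    """Hypothetical stand-in: log to the console and to a file inside logs_dir."""
    log_path = Path(logs_dir)
    log_path.mkdir(parents=True, exist_ok=True)

    # Hypothetical logger name, chosen only for this sketch.
    logger = logging.getLogger("mlia_sketch")
    logger.setLevel(logging.DEBUG)
    logger.handlers.clear()  # avoid duplicate handlers when a cell is re-run

    # Console output: more detail only when verbose is requested.
    console = logging.StreamHandler()
    console.setLevel(logging.DEBUG if verbose else logging.INFO)
    logger.addHandler(console)

    # File output: always keep the full detail in the log directory.
    file_handler = logging.FileHandler(log_path / "sketch.log")
    file_handler.setLevel(logging.DEBUG)
    logger.addHandler(file_handler)

    return logger


# Use a temporary directory so the sketch does not touch `mlia_logs`.
demo_dir = tempfile.mkdtemp()
demo_logger = configure_logging_sketch(demo_dir, verbose=False)
demo_logger.info("logging configured")
```

Keeping the file handler at `DEBUG` regardless of `verbose` is a design choice of this sketch: the console stays quiet while the log file retains the details needed for troubleshooting.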

The main entry point of the API is the function `get_advice`. It takes the following input arguments:

- `profile` - name of the target device profile. MLIA comes with a number of predefined target profiles. Please refer to the documentation for more details.
- `model` - path to the model
- `category` - category of advice for MLIA to produce. MLIA supports 4 advice categories:
    - `all` - uses all tools available in MLIA to produce the advice
    - `operators` - focuses on operator compatibility
    - `optimization` - performs a number of model optimizations and generates advice based on the results
    - `performance` - estimates model inference time and suggests ways to improve it further
- `optimization_targets` - list of optimization settings, which provides a way to explore the effects of different model optimizations
- `output` - path to the file where MLIA can save the report. MLIA infers the expected report format from the extension of the provided file.
- `working_dir` - path to the directory where MLIA stores intermediate files during execution, e.g. converted and optimized models
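The extension-based format detection described for `output` can be pictured with the following sketch. It is an illustration, not MLIA's code: the helper `guess_report_format` and the mapping `EXTENSION_TO_FORMAT` are hypothetical, and apart from JSON (which is used later in this notebook) the listed formats are assumptions.

```python
from pathlib import Path

# Hypothetical mapping; the formats MLIA actually supports may differ.
EXTENSION_TO_FORMAT = {
    ".json": "json",
    ".csv": "csv",
    ".txt": "plain_text",
}


def guess_report_format(output: str, default: str = "plain_text") -> str:
    """Return a format name based on the file extension of `output`."""
    suffix = Path(output).suffix.lower()  # e.g. ".json"; "" when there is no extension
    return EXTENSION_TO_FORMAT.get(suffix, default)


print(guess_report_format("mlia_output.json"))
```

Falling back to a default for unknown extensions is one possible policy; a stricter tool might raise an error instead.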

In [None]:
from mlia.api import get_advice

profile = "U55-256"
model = "../tests/test_resources/models/simple_model.h5"

If only `profile` and `model` are provided, MLIA uses the advice category `all`:

In [None]:
get_advice(profile, model)

If no optimization settings are provided, MLIA uses the default ones: `[pruning: 0.5, clustering: 32]`. The `optimization_targets` parameter (passed below through the `optimizations` variable) can be used to explore different optimization options. In the following example, the model is pruned with a target of 0.5 and the results are written to `mlia_output.json` in JSON format:

In [None]:
optimizations = [
    {
        "optimization_type": "pruning",
        "optimization_target": 0.5
    }
]

get_advice(profile, model, "all", optimizations, output="mlia_output.json")
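Once a JSON report has been written, it can be read back with the standard `json` module. The sketch below makes no assumption about the report's schema; the helper name `load_report` is hypothetical, and the snippet only inspects the file if it exists.

```python
import json
from pathlib import Path
from typing import Any


def load_report(path: str) -> Any:
    """Parse a JSON report file and return the resulting Python object."""
    with Path(path).open(encoding="utf-8") as report_file:
        return json.load(report_file)


report_path = Path("mlia_output.json")
if report_path.exists():
    report = load_report(str(report_path))
    # The report structure is not assumed here; only show its top-level type.
    print(type(report).__name__)
```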

Other advice categories can be used in a similar way. The next call produces an operator compatibility report:

In [None]:
get_advice(profile, model, category="operators")

The `performance` category is useful for highlighting model inference metrics on the target device:

In [None]:
get_advice(profile, model, category="performance")

The `optimizations` list defined earlier can also be passed with the category `optimization`:

In [None]:
get_advice(profile, model, "optimization", optimizations)
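When several optimization combinations need to be explored, writing the list of dictionaries by hand becomes repetitive. The helper below is a small convenience sketch, not part of MLIA: `build_optimizations` is a hypothetical name, and it only reproduces the dictionary layout used earlier in this notebook.

```python
from typing import Any, Dict, List


def build_optimizations(**targets: float) -> List[Dict[str, Any]]:
    """Turn keyword arguments into the list-of-dicts layout used in this notebook.

    Each keyword is an optimization type and its value is the optimization target.
    """
    return [
        {"optimization_type": name, "optimization_target": value}
        for name, value in targets.items()
    ]


# Mirrors the default settings quoted earlier: pruning 0.5, clustering 32.
default_like = build_optimizations(pruning=0.5, clustering=32)
print(default_like)
```

The resulting list could then be passed to `get_advice` in place of the hand-written `optimizations` list.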