# ONNX In MLRun

A collection of ONNX features in one MLRun function. The function includes the following handlers:

1. [to_onnx](#handler1) - Convert your model into `onnx` format.
2. [optimize](#handler2) - Perform ONNX optimizations using `onnxoptimizer` on a given ONNX model.

<a id="handler1"></a>

## 1. to_onnx

### 1.1. Docs
Convert the given model to an ONNX model.

#### Parameters:
* **`context`**: `mlrun.MLClientCtx` - The MLRun function execution context.
* **`model_name`**: `str` - The model's name.
* **`model_path`**: `str` - The model path store object.
* **`onnx_model_name`**: `str = None` - The name to use to log the converted ONNX model. If not given, the given `model_name` will be used with an additional suffix `_onnx`. Defaulted to None.
* **`framework`**: `str = None` - The model's framework. If None, it will be read from the 'framework' label of the model artifact provided. Defaulted to None.
* **`framework_kwargs`**: `Dict[str, Any] = None` - Additional arguments each framework may require in order to convert to ONNX. To get the docstring of the desired framework's ONNX conversion function, pass "help=True". Defaulted to None.
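
The default naming rule for `onnx_model_name` described above can be sketched as follows. The helper name `resolve_onnx_model_name` is hypothetical and is not part of the function's code; it only mirrors the documented behavior of appending `_onnx` when no name is given.

```python
from typing import Optional


def resolve_onnx_model_name(model_name: str, onnx_model_name: Optional[str] = None) -> str:
    """Hypothetical helper mirroring the documented default: '<model_name>_onnx'."""
    # An explicitly given name always wins; otherwise fall back to the suffix rule.
    return onnx_model_name if onnx_model_name is not None else f"{model_name}_onnx"


default_name = resolve_onnx_model_name("mobilenetv2")
explicit_name = resolve_onnx_model_name("mobilenetv2", "onnx_mobilenetv2")
print(default_name)
print(explicit_name)
```

In the demo below an explicit name is passed, so the suffix rule does not apply there.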

#### Supported keyword arguments (`framework_kwargs`) per framework:
`tf.keras`:
* **`input_signature`**: `List[Tuple[Tuple[int], str]] = None` - A list of the input layers' shape and data type properties. Each element of the list is an input layer tuple of:
  * [0] = Layer's shape, a tuple of integers.
  * [1] = Layer's data type, a numpy dtype string.

  If None, the function tries to read the input signature automatically before converting to ONNX. Defaulted to None.
* **`optimize_model`**: `bool = True` - Whether to optimize the ONNX model using 'onnxoptimizer' before saving the model. Defaulted to True.
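
As a rough illustration of the `input_signature` format described above, the sketch below builds a `framework_kwargs` dictionary for a single-input image model. The helper `build_input_signature` is hypothetical. The shape `(1, 224, 224, 3)` with `float32` assumes MobileNetV2's default 224x224 RGB input with a leading batch dimension of 1; whether the batch dimension belongs in the shape is an assumption worth checking against the conversion function's docstring.

```python
from typing import Any, Dict, List, Sequence, Tuple

InputSignature = List[Tuple[Tuple[int, ...], str]]


def build_input_signature(layers: Sequence[Tuple[Sequence[int], str]]) -> InputSignature:
    """Hypothetical helper: normalize (shape, dtype) pairs into the documented format."""
    signature: InputSignature = []
    for shape, dtype in layers:
        # The docs ask for a tuple of integers as the layer's shape.
        if not all(isinstance(dim, int) and dim > 0 for dim in shape):
            raise ValueError(f"Shape must contain positive integers only, got {shape!r}")
        # The docs ask for the data type as a numpy dtype string, e.g. "float32".
        if not isinstance(dtype, str):
            raise ValueError(f"Data type must be a string, got {dtype!r}")
        signature.append((tuple(shape), dtype))
    return signature


# Assumed: one input layer holding a batch of one 224x224 RGB image as float32.
framework_kwargs: Dict[str, Any] = {
    "input_signature": build_input_signature([((1, 224, 224, 3), "float32")]),
}
print(framework_kwargs)
```

The resulting dictionary would then be passed as the `framework_kwargs` parameter of `to_onnx`.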

### 1.2. Demo

We will use MobileNetV2 as our model and convert it to ONNX using the `to_onnx` handler.

1. First, we will set a temporary artifact path for our model to be saved in and choose the models' names:

In [1]:
import tempfile

# Create a temporary directory for the model artifacts (it is removed at the end of the demo):
ARTIFACT_PATH = tempfile.mkdtemp()

# Choose our model's name:
MODEL_NAME = "mobilenetv2"

# Choose our ONNX version model's name:
ONNX_MODEL_NAME = "onnx_mobilenetv2"

# Choose our optimized ONNX version model's name:
OPTIMIZED_ONNX_MODEL_NAME = "optimized_onnx_mobilenetv2"

2. Download the model from `keras.applications` and log it with MLRun's `TFKerasModelHandler`:

In [2]:
# mlrun: start-code

In [3]:
from tensorflow import keras

import mlrun
import mlrun.frameworks.tf_keras as mlrun_tf_keras


def get_model(context: mlrun.MLClientCtx, model_name: str):
    # Download the MobileNetV2 model:
    model = keras.applications.mobilenet_v2.MobileNetV2()

    # Initialize a model handler for logging the model:
    model_handler = mlrun_tf_keras.TFKerasModelHandler(
        model_name=model_name,
        model=model,
        context=context
    )

    # Log the model:
    model_handler.log()



In [4]:
# mlrun: end-code

3. Create the function using MLRun's `code_to_function` and run it:

In [5]:
import mlrun


# Create the function parsing this notebook's code using 'code_to_function':
get_model_function = mlrun.code_to_function(
    name="get_mobilenetv2",
    kind="job",
    image="mlrun/ml-models"
)

# Run the function to log the model:
get_model_run = get_model_function.run(
    handler="get_model",
    artifact_path=ARTIFACT_PATH,
    params={
        "model_name": MODEL_NAME
    },
    local=True
)

> 2021-11-01 13:30:49,331 [info] starting run get-mobilenetv2-get_model uid=682aa335dd494395a4eac5199f8e18c6 DB=
INFO:tensorflow:Assets written to: mobilenetv2/assets


| project | uid | iter | start | state | name | labels | inputs | parameters | results | artifacts |
|---|---|---|---|---|---|---|---|---|---|---|
| default | ...8e18c6 | 0 | Nov 01 11:30:49 | completed | get-mobilenetv2-get_model | kind=<br>owner=guyl<br>host=Guys-MacBook-Pro.local | | model_name=mobilenetv2 | | mobilenetv2.zip<br>mobilenetv2 |

> 2021-11-01 13:31:18,420 [info] run executed, status=completed


4. Import the `mlrun_onnx` MLRun function and run it:

In [6]:
# Import the ONNX function from the marketplace:
onnx_function = mlrun.import_function("function.yaml")

# Run the function to convert our model to ONNX:
to_onnx_run = onnx_function.run(
    handler="to_onnx",
    artifact_path=ARTIFACT_PATH,
    params={
        "model_name": MODEL_NAME,
        "model_path": get_model_run.outputs[MODEL_NAME],  # <- Take the logged model from the previous function.
        "onnx_model_name": ONNX_MODEL_NAME,
        "optimize_model": False  # <- For optimizing it later in the demo, we mark the flag as False
    },
    local=True
)

> 2021-11-01 13:31:18,473 [info] starting run mlrun-onnx-to_onnx uid=8d680c5bae994011821570b369166939 DB=./
Instructions for updating:
Use `tf.compat.v1.graph_util.extract_sub_graph`


| project | uid | iter | start | state | name | labels | inputs | parameters | results | artifacts |
|---|---|---|---|---|---|---|---|---|---|---|
| default | ...166939 | 0 | Nov 01 11:31:18 | completed | mlrun-onnx-to_onnx | kind=<br>owner=guyl<br>host=Guys-MacBook-Pro.local | | model_name=mobilenetv2<br>model_path=store://artifacts/default/mobilenetv2:682aa335dd494395a4eac5199f8e18c6<br>onnx_model_name=onnx_mobilenetv2<br>optimize_model=False | | onnx_mobilenetv2.onnx<br>onnx_mobilenetv2 |

> 2021-11-01 13:31:32,927 [info] run executed, status=completed


5. Listing the artifact directory now shows both our `tf.keras` model and the `onnx` model:

In [7]:
import os


print(os.listdir(ARTIFACT_PATH))

['onnx_mobilenetv2.onnx', 'mobilenetv2', 'model_spec.yaml', 'mobilenetv2.zip', 'onnx_mobilenetv2']
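
To pick the converted models out of a listing like the one above, a small filter on the `.onnx` suffix can help. This is a minimal sketch: `find_onnx_files` is a hypothetical helper, and the files it is demonstrated on are empty placeholders in a throwaway directory, not real models.

```python
import os
import tempfile
from typing import Dict


def find_onnx_files(directory: str) -> Dict[str, int]:
    """Map each '.onnx' file directly under `directory` to its size in bytes."""
    return {
        entry.name: entry.stat().st_size
        for entry in os.scandir(directory)
        if entry.is_file() and entry.name.endswith(".onnx")
    }


# Self-contained demonstration on a throwaway directory with placeholder files:
with tempfile.TemporaryDirectory() as demo_dir:
    for name, content in [("onnx_mobilenetv2.onnx", b"1234"), ("mobilenetv2.zip", b"12")]:
        with open(os.path.join(demo_dir, name), "wb") as placeholder:
            placeholder.write(content)
    onnx_files = find_onnx_files(demo_dir)

print(onnx_files)
```

In the demo, the same helper could be called as `find_onnx_files(ARTIFACT_PATH)`.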


<a id="handler2"></a>

## 2. optimize

### 2.1. Docs
Use the `onnxoptimizer` package to optimize the ONNX model. The supported optimizations can be listed by calling `onnxoptimizer.get_available_passes()`.

#### Parameters:
* **`context`**: `mlrun.MLClientCtx` - The MLRun function execution context.
* **`model_name`**: `str` - The model's name.
* **`model_path`**: `str` - The model path store object.
* **`optimizations`**: `List[str] = None` - List of possible optimizations. If None, all the optimizations will be used. Defaulted to None.
* **`fixed_point`**: `bool = False` - Optimize the weights using fixed point. Defaulted to False.
* **`optimized_model_name`**: `str = None` - The name of the optimized model. If None, the original model will be overridden. Defaulted to None.
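
The parameters above can be collected into the `params` dictionary of a run, for example to request only a subset of optimizations. The sketch below is illustrative: `build_optimize_params` is a hypothetical helper, and the pass names `eliminate_identity` and `fuse_bn_into_conv` are assumed to appear in the list returned by `onnxoptimizer.get_available_passes()` for the installed version.

```python
from typing import Any, Dict, List, Optional


def build_optimize_params(
    model_name: str,
    model_path: str,
    optimizations: Optional[List[str]] = None,
    fixed_point: bool = False,
    optimized_model_name: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: collect the documented `optimize` parameters into a dict."""
    params: Dict[str, Any] = {
        "model_name": model_name,
        "model_path": model_path,
        "fixed_point": fixed_point,
    }
    # Omitted keys keep their documented default of None:
    # all optimizations are used, and the original model is overridden.
    if optimizations is not None:
        params["optimizations"] = list(optimizations)
    if optimized_model_name is not None:
        params["optimized_model_name"] = optimized_model_name
    return params


params = build_optimize_params(
    model_name="onnx_mobilenetv2",
    model_path="store://artifacts/default/onnx_mobilenetv2:8d680c5bae994011821570b369166939",
    optimizations=["eliminate_identity", "fuse_bn_into_conv"],  # assumed pass names
    optimized_model_name="optimized_onnx_mobilenetv2",
)
print(params)
```

Passing an `optimized_model_name`, as the demo below does, keeps the unoptimized model available for comparison.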

### 2.2. Demo

We will use our converted model from the last example and optimize it.

1. We will now call the `optimize` handler:

In [8]:
onnx_function.run(
    handler="optimize",
    artifact_path=ARTIFACT_PATH,
    params={
        "model_name": ONNX_MODEL_NAME,
        "model_path": to_onnx_run.output(ONNX_MODEL_NAME),
        "optimized_model_name": OPTIMIZED_ONNX_MODEL_NAME,
    },
    local=True
)

> 2021-11-01 13:31:32,943 [info] starting run mlrun-onnx-optimize uid=d7643a5e26094c69ade921dae36edd42 DB=./


| project | uid | iter | start | state | name | labels | inputs | parameters | results | artifacts |
|---|---|---|---|---|---|---|---|---|---|---|
| default | ...6edd42 | 0 | Nov 01 11:31:32 | completed | mlrun-onnx-optimize | kind=<br>owner=guyl<br>host=Guys-MacBook-Pro.local | | model_name=onnx_mobilenetv2<br>model_path=store://artifacts/default/onnx_mobilenetv2:8d680c5bae994011821570b369166939<br>optimized_model_name=optimized_onnx_mobilenetv2 | | optimized_onnx_mobilenetv2.onnx<br>optimized_onnx_mobilenetv2 |

> 2021-11-01 13:31:33,378 [info] run executed, status=completed


`<mlrun.model.RunObject at 0x166da96d0>`

2. Our model is now optimized and can be seen under the `ARTIFACT_PATH`:

In [9]:
print(os.listdir(ARTIFACT_PATH))

['onnx_mobilenetv2.onnx', 'mobilenetv2', 'optimized_onnx_mobilenetv2', 'model_spec.yaml', 'mobilenetv2.zip', 'onnx_mobilenetv2', 'optimized_onnx_mobilenetv2.onnx']


Lastly, run this code to clean up the models:

In [10]:
import shutil


shutil.rmtree(ARTIFACT_PATH)