 # Convert your trained models to ONNX format
 Dlacc accelerates trained deep learning models. First, convert your trained model (PyTorch, TensorFlow, MXNet, etc.) to ONNX format. Open Neural Network Exchange (ONNX) is an open standard format for representing machine learning models. It is supported by a community of partners who have implemented it in many frameworks and tools.

 ## PyTorch to ONNX
 We load a pre-trained ResNet-18 model.

In [None]:
import torch
model = torch.hub.load('pytorch/vision:v0.10.0', 'resnet18', pretrained=True)
model.eval()


 Next, we construct an example input. The helper contruct_dummy_input() defined below does this: pass it a list of valid input shapes and a list of their corresponding datatypes.

In [None]:
def contruct_dummy_input(input_shape, input_dtype):
    dummy_input = tuple(
        [
            torch.randn(*v).type(
                {
                    "int32": torch.int32,
                    "int64": torch.int64,
                    "float32": torch.float32,
                    "float64": torch.float64,
                }[input_dtype[i]]
            )
            for i, v in enumerate(input_shape)
        ]
    )
    return dummy_input

input_shape = [[10,3,224,224]]
input_dtype = ["float32"]

dummy_input = contruct_dummy_input(input_shape, input_dtype)
model.eval()
# Export the model
torch.onnx.export(
    model,  # model being run
    dummy_input,  # model input (or a tuple for multiple inputs)
    "resnet18.onnx",
    export_params=True,  # store the trained parameter weights inside the model file
    do_constant_folding=True,  # whether to execute constant folding for optimization
    verbose=True,
)


 # Parameter configuration
 Next, we create a global JSON-style configuration. The tool runs the optimization process according to this configuration.

In [None]:
config = {
    "job_id": "100000",
    "status": 0,
    "model_name" : "resnet",
    "model_path": "./resnet18.onnx",
    "platform_type": 0, 
    "model_type" : 2,
    "target": "llvm -mcpu cascadelake",
    "model_config":{
        "input_shape":{
            "input.1": [10,3,224,224],
        },
        "input_dtype":{
            "input.1": "float32",
        }
    },
    "tuning_config": {
        "mode": "ansor",
        "num_measure_trials": 24,
        "verbose_print": 0
    },
    "tuned_log":"",
    "need_benchmark": True
}
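
Since the configuration is described as JSON, it can be written to disk and read back with Python's json module. The following is a minimal sketch under stated assumptions: the file name config.json is made up for illustration, the dictionary is a shortened copy of the one above, and nothing here confirms how Dlacc itself reads configuration files. Note that Python's True is serialized as JSON true.

```python
import json
import os
import tempfile

# Shortened copy of the configuration above, for illustration only.
config = {
    "job_id": "100000",
    "status": 0,
    "model_path": "./resnet18.onnx",
    "model_config": {
        "input_shape": {"input.1": [10, 3, 224, 224]},
        "input_dtype": {"input.1": "float32"},
    },
    "need_benchmark": True,
}

# "config.json" is a hypothetical file name; a temporary directory keeps the
# example from touching the working directory.
with tempfile.TemporaryDirectory() as tmp_dir:
    config_path = os.path.join(tmp_dir, "config.json")
    with open(config_path, "w") as f:
        json.dump(config, f, indent=4)
    with open(config_path) as f:
        raw_text = f.read()
    loaded_config = json.loads(raw_text)

# The round trip preserves the dictionary contents.
print(loaded_config == config)
```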


 The fields are:
 - job_id: ID of the job, a random int
 - status:
     - 0: ready
     - 1: import to onnx finished
     - 2: ansor tuning finished (time-consuming; skippable if tuned_log is specified)
     - 3: compile finished
     - 4: job done
     - -1: error
 - model_name: name of the model
 - model_path: path to the model.
     - an absolute local path if platform_type==LOCAL
     - a Google Storage bucket link if platform_type==GOOGLESTORAGE
 - platform_type: type of source platform that stores the model file and input JSON.

     ```python
     class PlateformType(enum.IntEnum):
         LOCAL = 0
         GOOGLESTORAGE = 1
         AWSSTORAGE = 2
     ```

 - model_type: type of model. Only the ONNX format is supported for now.

 ```python
 class ModelType(enum.IntEnum):
     PT = 0
     TF = 1
     ONNX = 2
     KERAS = 3
 ```

 - target: target hardware backend information
 - model_config
     - input_shape: shape of each input. **The first dimension must be batch size.**
     - input_dtype: datatype of each input.
 - tuning_config: tuning parameter configuration
     - mode: string value, ansor or autotvm. Only ansor is supported for now.
     - num_measure_trials: an int value. More trials give better performance but cost more time.
         - when testing, 10 for a quick execution
         - when in production, 20000 for best performance
     - verbose_print: whether to enable verbose printing
 - tuned_log: dev only. Tuning is skipped if a tuned log is passed.
 - error_info: dev only. Exception information raised during execution.
 - need_benchmark: bool value. Whether a comparison with the original model is needed.
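
The rules in the list above can be checked before starting a run. The sketch below is an illustration and not part of Dlacc: the names JobStatus, REQUIRED_FIELDS, and validate_config are hypothetical, and the set of required fields is an assumption inferred from the example configuration.

```python
import enum


class JobStatus(enum.IntEnum):
    """Hypothetical enum mirroring the status codes listed above."""
    ERROR = -1
    READY = 0
    IMPORTED = 1
    TUNED = 2
    COMPILED = 3
    DONE = 4


# Assumed set of required fields, taken from the example configuration.
REQUIRED_FIELDS = ("job_id", "status", "model_name", "model_path",
                   "platform_type", "model_type", "target",
                   "model_config", "tuning_config", "need_benchmark")


def validate_config(config):
    """Return a list of problems found in a config dict (empty if none)."""
    problems = [f"missing field: {name}"
                for name in REQUIRED_FIELDS if name not in config]
    if problems:
        return problems
    if config["status"] not in [s.value for s in JobStatus]:
        problems.append(f"unknown status: {config['status']}")
    shapes = config["model_config"].get("input_shape", {})
    dtypes = config["model_config"].get("input_dtype", {})
    # Every input needs both a shape and a dtype.
    if set(shapes) != set(dtypes):
        problems.append("input_shape and input_dtype name different inputs")
    # The first dimension is the batch size, so it should be a positive int.
    for name, shape in shapes.items():
        if not shape or not isinstance(shape[0], int) or shape[0] < 1:
            problems.append(f"{name}: first dimension must be a batch size")
    if not isinstance(config["tuning_config"].get("num_measure_trials"), int):
        problems.append("num_measure_trials must be an int")
    return problems


example = {
    "job_id": "100000", "status": 0, "model_name": "resnet",
    "model_path": "./resnet18.onnx", "platform_type": 0, "model_type": 2,
    "target": "llvm -mcpu cascadelake",
    "model_config": {"input_shape": {"input.1": [10, 3, 224, 224]},
                     "input_dtype": {"input.1": "float32"}},
    "tuning_config": {"mode": "ansor", "num_measure_trials": 24,
                      "verbose_print": 0},
    "need_benchmark": True,
}
print(validate_config(example))
```

Returning a list of problems instead of raising on the first one lets a caller report every issue in a single pass.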

 ## Run optimization

In [None]:
from dlacc.optimum import Optimum
import onnx
model = onnx.load("./resnet18.onnx")
output = [node.name for node in model.graph.output]

input_all = [node.name for node in model.graph.input]
input_initializer = [node.name for node in model.graph.initializer]
net_feed_input = list(set(input_all) - set(input_initializer))

print('Inputs: ', net_feed_input)
print('Outputs: ', output)
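
Because a set difference does not preserve order, net_feed_input may come back in a different order than the graph declares its inputs, which can matter for models with several inputs. A small order-preserving variant might look like the sketch below. It runs on plain lists of strings standing in for the graph entries; the names weight, mask, and bias are made up, and feed_input_names is a hypothetical helper.

```python
def feed_input_names(input_names, initializer_names):
    """Keep inputs that are not initializers, in their original order."""
    initializers = set(initializer_names)
    return [name for name in input_names if name not in initializers]


# Plain strings stand in for model.graph.input / model.graph.initializer names.
input_all = ["input.1", "weight", "mask", "bias"]
input_initializer = ["bias", "weight"]
net_feed_input = feed_input_names(input_all, input_initializer)
print("Inputs: ", net_feed_input)
```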




 The optimization process may produce many error messages. This is normal, because the optimization engine tries some invalid schedules. You can safely ignore them as long as tuning continues, because these errors are isolated from the main process. After optimization, the optimized model, statistics, and logs are saved in the ./outputs folder. The optimized model is saved in the ./outputs/optimized_model folder as three files: deploy_graph.json, deploy_lib.tar, and deploy_param.params. You can reuse these three files later in your own production environment.

In [None]:
optimum = Optimum("myresnet")
optimum.run(model, config)
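
Once the run finishes, the presence of the three output files named above can be checked before deploying them. This is a minimal sketch using only the standard library; the helper name missing_model_files is hypothetical, and the demonstration uses a temporary directory standing in for the real ./outputs/optimized_model folder.

```python
import tempfile
from pathlib import Path

# File names as listed in the description of the optimized model folder.
EXPECTED_FILES = ("deploy_graph.json", "deploy_lib.tar", "deploy_param.params")


def missing_model_files(model_dir):
    """Return the expected files that are absent from model_dir."""
    model_dir = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (model_dir / name).is_file()]


# Demonstration on a temporary folder standing in for ./outputs/optimized_model.
with tempfile.TemporaryDirectory() as tmp_dir:
    (Path(tmp_dir) / "deploy_graph.json").write_text("{}")
    missing_before = missing_model_files(tmp_dir)
    # Create empty placeholders for whatever was missing, then check again.
    for name in missing_before:
        (Path(tmp_dir) / name).write_bytes(b"")
    missing_after = missing_model_files(tmp_dir)

print(missing_before, missing_after)
```

In practice the helper would be called as missing_model_files("./outputs/optimized_model") after optimization.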


 # Load optimized model and make a single prediction

In [None]:
inputs_dict = {}
predict_model = optimum.load_model("./outputs/optimized_model/", "llvm -mcpu cascadelake")
result = predict_model.predict(inputs_dict)
print(result)
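
The cell above leaves inputs_dict empty. One plausible way to fill it is sketched below, assuming predict() accepts a dictionary that maps each input name to a NumPy array with the shape and datatype declared in model_config. That assumption is not confirmed by this tutorial, and the helper name make_random_inputs is hypothetical.

```python
import numpy as np


def make_random_inputs(model_config, seed=0):
    """Build a name -> random array dict from a model_config section."""
    rng = np.random.default_rng(seed)
    inputs = {}
    for name, shape in model_config["input_shape"].items():
        dtype = model_config["input_dtype"][name]
        # Random normal data cast to the declared dtype; real inputs would be
        # preprocessed data of the same shape.
        inputs[name] = rng.standard_normal(tuple(shape)).astype(dtype)
    return inputs


model_config = {
    "input_shape": {"input.1": [10, 3, 224, 224]},
    "input_dtype": {"input.1": "float32"},
}
inputs_dict = make_random_inputs(model_config)
print({name: (arr.shape, arr.dtype) for name, arr in inputs_dict.items()})
```

The keys must match the input names reported for the ONNX graph (here input.1), and the first dimension must equal the batch size given in input_shape.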
