Copyright (c) Microsoft Corporation. All rights reserved.

Licensed under the MIT License.

# ONNX Pipeline

This repository shows how to deploy and use the ONNX pipeline with Docker. The pipeline covers model conversion, input generation, and performance testing.

# Prerequisites

Pull the Docker images from Azure. This should take several minutes.

In [1]:
!sh build.sh

Error response from daemon: Get https://ziylregistry.azurecr.io/v2/: unauthorized: authentication required
Using default tag: latest
latest: Pulling from onnx-converter
Digest: sha256:bf384add03095803386fafa7c7e167d3fe537792a1f3fa6fdfe315406fdcd473
Status: Image is up to date for ziylregistry.azurecr.io/onnx-converter:latest
Using default tag: latest
latest: Pulling from perf_test
Digest: sha256:0b93e6a1d4e4cd5e0057cf503fce53b8702d2445252b7837e844e50752d2a369
Status: Image is up to date for ziylregistry.azurecr.io/perf_test:latest


Import the onnxpipeline SDK and create a pipeline object.

In [2]:
import onnxpipeline
pipeline = onnxpipeline.Onnxpip('mnist/model', print_logs=True)

# Convert model to ONNX

This image converts models from major model frameworks to ONNX. Supported frameworks are caffe, cntk, coreml, keras, libsvm, mxnet, scikit-learn, tensorflow and pytorch.

You can run the Docker image with customized parameters.

In [3]:
model = pipeline.convert_model(model_type='tensorflow')


2019-06-07 00:11:45,856 - INFO - Using tensorflow=1.13.1, onnx=1.5.0, tf2onnx=1.5.1/0c735a

2019-06-07 00:11:45,856 - INFO - Using opset <onnx, 7>

2019-06-07 00:11:45,981 - INFO - 

2019-06-07 00:11:46,015 - INFO - Optimizing ONNX model

2019-06-07 00:11:46,049 - INFO - After optimization: Add -2 (4->2), Const +1 (12->13), Gather +1 (0->1), Identity -5 (5->0), Transpose -6 (8->2)

2019-06-07 00:11:46,053 - INFO - 

2019-06-07 00:11:46,053 - INFO - Successfully converted TensorFlow model /mnt/model to ONNX

2019-06-07 00:11:46,072 - INFO - ONNX model is saved at /mnt/model/test/model.onnx

2019-06-07 00:11:46.390238: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA

2019-06-07 00:11:46.398449: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3600000000 Hz

2019-06-07 00:11:46.399299: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x5593ed23649

## Run parameters

(1) model: string

    Required. The path of the model that needs to be converted.

(2) output_onnx_path: string

    Required. The path where the converted ONNX model is stored. Should end with ".onnx", e.g. output.onnx.

(3) model_type: string

    Required. The name of the original model framework. Available types are caffe, cntk, coreml, keras, libsvm, mxnet, scikit-learn, tensorflow and pytorch.

(4) model_inputs: string

    Optional. The model's input names. Required for tensorflow frozen models and checkpoints.

(5) model_outputs: string

    Optional. The model's output names. Required for tensorflow frozen models and checkpoints.

(6) model_params: string

    Optional. The params of the model if needed.

(7) model_input_shapes: list of tuple

    Optional. The input shape(s) of the model, given as a list of tuples. Each dimension is separated by ','.

(8) target_opset: int

    Optional. Specifies the opset for ONNX, for example, 7 for ONNX 1.2, and 8 for ONNX 1.3. Defaults to 7.
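
The rules in the parameter list above can be checked before a conversion is started. The sketch below is only an illustration: `check_convert_params` is a hypothetical helper, not part of the onnxpipeline SDK, and it takes the list at face value. The SDK itself may fill in some of these values, as the `convert_model(model_type='tensorflow')` call above suggests.

```python
# Hypothetical helper, not part of the onnxpipeline SDK.
# It mirrors the documented rules: three required parameters,
# an ".onnx" suffix on the output path, a supported framework name,
# and a default target opset of 7.
SUPPORTED_MODEL_TYPES = {
    "caffe", "cntk", "coreml", "keras", "libsvm",
    "mxnet", "scikit-learn", "tensorflow", "pytorch",
}
REQUIRED = ("model", "output_onnx_path", "model_type")
DEFAULT_TARGET_OPSET = 7


def check_convert_params(params):
    """Return a copy of params with the default opset applied, or raise ValueError."""
    missing = [name for name in REQUIRED if not params.get(name)]
    if missing:
        raise ValueError("missing required parameter(s): " + ", ".join(missing))
    if not params["output_onnx_path"].endswith(".onnx"):
        raise ValueError("output_onnx_path should end with '.onnx'")
    if params["model_type"] not in SUPPORTED_MODEL_TYPES:
        raise ValueError("unsupported model_type: " + params["model_type"])
    checked = dict(params)
    checked.setdefault("target_opset", DEFAULT_TARGET_OPSET)
    return checked


# Placeholder values for illustration only.
params = check_convert_params({
    "model": "mnist/model",
    "output_onnx_path": "output.onnx",
    "model_type": "tensorflow",
})
print(params["target_opset"])
```

Validating up front keeps a typo in `model_type` or a missing ".onnx" suffix from surfacing only after the container has started.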


# Performance test tool

You can run perf_test with the command `python perf_test.py [Your model path] [Output path on the docker]`. It accepts the same arguments as the onnxruntime_perf_test tool, e.g. -m for mode and -e to specify the execution provider. By default it tries all available providers.
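
As a rough illustration of that command line, the sketch below assembles the argument list in Python. `build_perf_test_command` is a hypothetical helper, the paths are placeholders, and passing `cpu` to `-e` is an assumption about the values the underlying tool accepts.

```python
import shlex


def build_perf_test_command(model_path, output_path, extra_args=()):
    """Hypothetical helper: build the argv for the documented perf_test.py call.

    extra_args are passed through unchanged, e.g. ["-e", "cpu"].
    """
    return ["python", "perf_test.py", model_path, output_path, *extra_args]


# Placeholder paths; "-e cpu" is an assumed provider value.
cmd = build_perf_test_command("model.onnx", "result", ["-e", "cpu"])
print(shlex.join(cmd))
```

The resulting list can be handed to `subprocess.run` inside the container. Building it as a list rather than a string avoids shell quoting problems in paths.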

In [4]:
result = pipeline.perf_test(model=model)
# result = pipeline.perf_test()   # is ok, too

Cores:  6



/home/artr/repo/onnxruntime/include/onnxruntime/core/session/onnxruntime_cxx_api.h:127 OrtCreateSession /home/artr/repo/onnxruntime/onnxruntime/core/providers/cuda/cuda_call.cc:97 bool onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*) [with ERRTYPE = cudaError; bool THRW = true] /home/artr/repo/onnxruntime/onnxruntime/core/providers/cuda/cuda_call.cc:91 bool onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*) [with ERRTYPE = cudaError; bool THRW = true] CUDA failure 35: CUDA driver version is insufficient for CUDA runtime version ; GPU=32679 ; hostname=d4817cd8dcd6 ; expr=cudaSetDevice(device_id_); 

Stacktrace:



Stacktrace:







Setting thread pool size to 0





/home/artr/repo/onnxruntime/include/onnxruntime/core/session/onnxruntime_cxx_api.h:127 OrtCreateSession /home/artr/repo/onnxruntime/onnxruntime/core/providers/cuda/cuda_call.cc:97 bool onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, con



Setting thread pool size to 0

Total time cost:0.441109

Total iterations:10

Average time cost:44.1109 ms



ngraph 44.1109



/home/artr/repo/onnxruntime/include/onnxruntime/core/session/onnxruntime_cxx_api.h:127 OrtCreateSession /home/artr/repo/onnxruntime/onnxruntime/core/providers/cuda/cuda_call.cc:97 bool onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*) [with ERRTYPE = cudaError; bool THRW = true] /home/artr/repo/onnxruntime/onnxruntime/core/providers/cuda/cuda_call.cc:91 bool onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*) [with ERRTYPE = cudaError; bool THRW = true] CUDA failure 35: CUDA driver version is insufficient for CUDA runtime version ; GPU=32589 ; hostname=d4817cd8dcd6 ; expr=cudaSetDevice(device_id_); 

Stacktrace:



Stacktrace:







Setting thread pool size to 0





/home/artr/repo/onnxruntime/include/onnxruntime/core/session/onnxruntime_cxx_api.h:127 OrtCreateSession /home/artr/repo/onnxruntime/onnxr







Setting thread pool size to 0

Total time cost:0.0112063

Total iterations:10

Average time cost:1.12063 ms



mkldnn_openmp OMP_NUM_THREADS=1, active 1.12063





Setting thread pool size to 0

Total time cost:0.0171221

Total iterations:10

Average time cost:1.71221 ms







Setting thread pool size to 0

Total time cost:0.0185428

Total iterations:10

Average time cost:1.85428 ms



mkldnn_openmp passive 1.85428





Setting thread pool size to 0

Total time cost:0.00677242

Total iterations:10

Average time cost:0.677242 ms







Setting thread pool size to 0

Total time cost:0.00589886

Total iterations:10

Average time cost:0.589886 ms



mkldnn_openmp active 0.589886





Setting thread pool size to 0

Total time cost:0.00638357

Total iterations:10

Average time cost:0.638357 ms







Setting thread pool size to 0

Total time cost:0.0059804

Total iterations:10

Average time cost:0.59804 ms



cpu 0.59804





Setting thread pool size to 0

Total time cost:0.00786069



## Run parameters

(1) convert-to-onnx-output: string

    Required. The path of the model to run the performance test on.

(2) output-perf-result-path: string

    Required. The path where the performance test result is stored.

In [5]:
pipeline.print_result(result)
#pipeline.print_result() # is ok, too

mkldnn_openmp active 0.589886 ms

cpu 0.59804 ms

mkldnn 0.67346 ms

cpu_openmp active 0.733781 ms

mkldnn_openmp OMP_NUM_THREADS=6, active 0.788804 ms

cpu_openmp OMP_NUM_THREADS=6, active 0.83225 ms

cpu_openmp OMP_NUM_THREADS=5, active 0.842299 ms

mkldnn_openmp OMP_NUM_THREADS=5, active 0.857851 ms

mkldnn_openmp OMP_NUM_THREADS=4, active 0.865186 ms

mkldnn_openmp OMP_NUM_THREADS=3, active 0.912573 ms

cpu_openmp OMP_NUM_THREADS=4, active 0.93001 ms

cpu_openmp OMP_NUM_THREADS=6, passive 0.936398 ms

cpu_openmp OMP_NUM_THREADS=5, passive 0.976594 ms

cpu_openmp OMP_NUM_THREADS=3, active 0.997364 ms

cpu_openmp OMP_NUM_THREADS=4, passive 1.01441 ms

cpu_openmp OMP_NUM_THREADS=2, active 1.01537 ms

mkldnn_openmp OMP_NUM_THREADS=2, active 1.05105 ms

cpu_openmp OMP_NUM_THREADS=3, passive 1.08854 ms

cpu_openmp OMP_NUM_THREADS=2, passive 1.09684 ms

mkldnn_openmp OMP_NUM_THREADS=1, active 1.12063 ms

mkldnn_openmp OMP_NUM_THREADS=5, passive 1.13045 ms

mkldnn_openmp OMP_NUM_THREADS=4,
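
If the printed ranking needs further processing, each line can be split into a configuration name and a latency. This is a minimal sketch that assumes every complete line has the form `<configuration> <latency> ms`, as in the output above. `parse_results` is a hypothetical helper, not an SDK function, and lines that do not match (such as a truncated one) are skipped.

```python
import re

# Matches "<configuration> <latency> ms", e.g. "cpu 0.59804 ms".
LINE = re.compile(r"^(?P<name>.+?)\s+(?P<ms>\d+(?:\.\d+)?) ms$")


def parse_results(text):
    """Return (configuration, latency_ms) pairs sorted from fastest to slowest."""
    rows = []
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if match:  # blank or incomplete lines are ignored
            rows.append((match.group("name"), float(match.group("ms"))))
    return sorted(rows, key=lambda row: row[1])


# A few lines copied from the output above, including one truncated line.
sample = """
mkldnn_openmp OMP_NUM_THREADS=6, active 0.788804 ms

cpu 0.59804 ms

mkldnn_openmp active 0.589886 ms
mkldnn_openmp OMP_NUM_THREADS=4,
"""
rows = parse_results(sample)
fastest_name, fastest_ms = rows[0]
print(fastest_name, fastest_ms)
```

Sorting on the numeric latency rather than on the text keeps the order correct even when the values have different numbers of digits.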

# Netron

In [5]:
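# Experimental: serve the model with Netron and embed the viewer in the notebook.
import netron
from IPython.display import IFrame

netron.start('../deepthink.onnx')
# The IFrame URL must point at the host where Netron is serving;
# replace the address below with your own.
IFrame('http://10.161.18.106:8080', width=700, height=350)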

Serving '../deepthink.onnx' at http://localhost:8080
