Copyright (c) Microsoft Corporation. All rights reserved.

Licensed under the MIT License.

# ONNX Pipeline

This repository shows how to deploy and use the ONNX pipeline with Docker images, including model conversion, input generation, and performance testing.

# Prerequisites

Pull the Docker images from Azure. This should take several minutes.

In [1]:
!sh build.sh

Error response from daemon: Get https://ziylregistry.azurecr.io/v2/: unauthorized: authentication required
Using default tag: latest
latest: Pulling from onnx-converter
Digest: sha256:43036294bac2bc2c88e5a42ff85a5cd38ac966004b02e4a25ca54f83ca970010
Status: Image is up to date for ziylregistry.azurecr.io/onnx-converter:latest
Using default tag: latest
latest: Pulling from perf_test
Digest: sha256:0b93e6a1d4e4cd5e0057cf503fce53b8702d2445252b7837e844e50752d2a369
Status: Image is up to date for ziylregistry.azurecr.io/perf_test:latest


Import the onnxpipeline SDK and create a pipeline for the directory that holds your model.

In [14]:
import onnxpipeline

# test tensorflow
#pipeline = onnxpipeline.Pipeline('mnist/model')

# test pytorch
#pipeline = onnxpipeline.Pipeline('pytorch')

# cntk 
#pipeline = onnxpipeline.Pipeline('cntk')

# keras
#pipeline = onnxpipeline.Pipeline('KerasToONNX')

# sklearn
#pipeline = onnxpipeline.Pipeline('sklearn')

# caffe
pipeline = onnxpipeline.Pipeline('caffe')

# empty
#pipeline = onnxpipeline.Pipeline()

# test mxnet fail
#pipeline = onnxpipeline.Pipeline('mxnet')


## Run parameters

(1) local_directory: string

    Required. The path of the local directory that will be mounted into the Docker container. All operations are executed from this path.

(2) mount_path: string

    Optional. The path inside the Docker container where local_directory is mounted. Default is "/mnt/model".

(3) print_logs: boolean

    Optional. Whether to print the logs from the Docker container. Default is True.
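
To make the relationship between local_directory and mount_path concrete, here is a minimal sketch of the volume mapping they describe, written in the dictionary format that the Docker SDK for Python accepts for its `volumes` argument. The helper name `build_volume_mapping` is hypothetical and is not part of onnxpipeline. How the SDK builds this mapping internally is not shown in this notebook, so treat this as an illustration only.

```python
import os


def build_volume_mapping(local_directory, mount_path="/mnt/model"):
    """Return a volumes dict in the shape docker-py expects.

    Hypothetical helper for illustration; not part of onnxpipeline.
    """
    if not local_directory:
        raise ValueError("local_directory is required")
    # Docker bind mounts need an absolute host path.
    host_path = os.path.abspath(local_directory)
    # Assumption: the directory is mounted read-write so results can be written back.
    return {host_path: {"bind": mount_path, "mode": "rw"}}


volumes = build_volume_mapping("caffe")
print(volumes)
```

With this mapping, a file such as `caffe/deploy.prototxt` on the host would be visible as `/mnt/model/deploy.prototxt` inside the container.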


# Config information for ONNX pipeline

In [15]:
pipeline.config()

-----------Config----------------
           Container information: <docker.client.DockerClient object at 0x7f70b3f10ad0>
 Local directory path for volume: /home/chuche/notebook/caffe
Volume directory path in dockers: /mnt/model
                     Result path: result.txt
        Print logs in the docker: True


# Convert model to ONNX

This image converts models from the major model frameworks to ONNX. Supported frameworks are caffe, cntk, coreml, keras, libsvm, mxnet, scikit-learn, tensorflow and pytorch.


You can run the docker image with customized parameters.

In [16]:
# test tensorflow
#model = pipeline.convert_model(model_type='tensorflow')

# test pytorch
#model = pipeline.convert_model(model_type='pytorch', model='saved_model.pb', model_input_shapes='(1,3,224,224)')

# test cntk
#model = pipeline.convert_model(model_type='cntk', model='ResNet50_ImageNet_Caffe.model')

# test keras
#model = pipeline.convert_model(model_type='keras', model='keras_Average_ImageNet.keras', input_json='input.json', convert_json=True)

# test sklearn
#model = pipeline.convert_model(model_type='scikit-learn', model='sklearn_svc.joblib', initial_types=("float_input", "FloatTensorType([1,4])"), input_json='input.json', convert_json=True)

# test caffe
model = pipeline.convert_model(model_type='caffe', model='bvlc_alexnet.caffemodel', caffe_model_prototxt='deploy.prototxt', input_json='input.json', convert_json=True)


# test mxnet
#model = pipeline.convert_model(model_type='mxnet', model='resnet.json', model_params='resnet.params', model_input_shapes='(1,3,224,224)')




Layer 0: Type: 'Input', Name: 'data'. Output(s): 'data'.

Ignoring batch size and retaining only the trailing 3 dimensions for conversion. 

Layer 1: Type: 'Convolution', Name: 'conv1'. Input(s): 'data'. Output(s): 'conv1'.

Layer 2: Type: 'ReLU', Name: 'relu1'. Input(s): 'conv1'. Output(s): 'conv1'.

Layer 3: Type: 'LRN', Name: 'norm1'. Input(s): 'conv1'. Output(s): 'norm1'.

Layer 4: Type: 'Pooling', Name: 'pool1'. Input(s): 'norm1'. Output(s): 'pool1'.

Layer 5: Type: 'Convolution', Name: 'conv2'. Input(s): 'pool1'. Output(s): 'conv2'.

Layer 6: Type: 'ReLU', Name: 'relu2'. Input(s): 'conv2'. Output(s): 'conv2'.

Layer 7: Type: 'LRN', Name: 'norm2'. Input(s): 'conv2'. Output(s): 'norm2'.

Layer 8: Type: 'Pooling', Name: 'pool2'. Input(s): 'norm2'. Output(s): 'pool2'.

Layer 9: Type: 'Convolution', Name: 'conv3'. Input(s): 'pool2'. Output(s): 'conv3'.

Layer 10: Type: 'ReLU', Name: 'relu3'. Input(s): 'conv3'. Output(s): 'conv3'.

Layer 11: Type: 'Convolution', Name: 'conv4'. Input

## Run parameters

(1) model: string

    Required. The path of the model that needs to be converted.

(2) output_onnx_path: string

    Required. The path to store the converted onnx model. Should end with ".onnx". e.g. output.onnx

(3) model_type: string

    Required. The name of original model framework. Available types are caffe, cntk, coreml, keras, libsvm, mxnet, scikit-learn, tensorflow and pytorch.

(4) model_inputs: string

    (tensorflow) Optional. The model's input names. Required for tensorflow frozen models and checkpoints.

(5) model_outputs: string

    (tensorflow) Optional. The model's output names. Required for tensorflow frozen models and checkpoints.

(6) model_params: string 

    (mxnet) Optional. The params of the model if needed.

(7) model_input_shapes: list of tuples

    (pytorch, mxnet) Optional. The input shape(s) of the model, given as a list of tuples with the dimensions separated by ','.

(8) target_opset: int

    Optional. Specifies the opset for ONNX, for example, 7 for ONNX 1.2, and 8 for ONNX 1.3. Defaults to 7.
    
(9) caffe_model_prototxt: string

    (caffe) Optional. The filename of the deploy prototxt for the caffe model.

(10) initial_types: tuple (string, string)

    (scikit-learn) Optional. A tuple consisting of two strings. The first is the input name and the second is the tensor type with its shape, e.g., ('float_input', 'FloatTensorType([1,4])')

(11) input_json: string

    Optional. Use a JSON file as the source of the input parameters.

(12) convert_json: boolean

    Optional. Convert all parameters into a JSON file that can be used as input parameters.
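
The constraints listed above (a model path, a supported model_type, and an output path ending in ".onnx") can be checked before starting the container. The following is a minimal sketch of such a check. The helper `check_convert_args` is hypothetical and not part of onnxpipeline; it only encodes the rules stated in the parameter list.

```python
# Framework names copied from the model_type description above.
SUPPORTED_MODEL_TYPES = {
    "caffe", "cntk", "coreml", "keras", "libsvm",
    "mxnet", "scikit-learn", "tensorflow", "pytorch",
}


def check_convert_args(model, model_type, output_onnx_path):
    """Return a list of problems with the arguments (empty if none).

    Hypothetical helper for illustration; not part of onnxpipeline.
    """
    problems = []
    if not model:
        problems.append("model is required")
    if model_type not in SUPPORTED_MODEL_TYPES:
        problems.append("unsupported model_type: %r" % (model_type,))
    if not output_onnx_path.endswith(".onnx"):
        problems.append("output_onnx_path should end with '.onnx'")
    return problems


print(check_convert_args("bvlc_alexnet.caffemodel", "caffe", "model.onnx"))
print(check_convert_args("resnet.json", "mxnett", "resnet"))
```

The first call reports no problems. The second reports the misspelled framework name and the missing ".onnx" suffix.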


# Performance test tool

You can run perf_test with the command `python perf_test.py [Your model path] [Output path on the docker]`. It accepts the same arguments as the onnxruntime_perf_test tool, e.g. -m for the mode and -e to specify the execution provider. By default it tries all available providers.
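
As an illustration of the command form described above, the sketch below assembles the argument list for perf_test.py. The helper `build_perf_test_command` is hypothetical and not part of onnxpipeline. The values passed to -m and -e are examples only; check the onnxruntime_perf_test documentation for the values your build supports.

```python
import shlex


def build_perf_test_command(model_path, output_path, extra_args=()):
    """Assemble the perf_test.py command as an argument list.

    Hypothetical helper for illustration; not part of onnxpipeline.
    """
    return ["python", "perf_test.py", model_path, output_path, *extra_args]


# Example values only: "times" as the mode and "cpu" as the execution provider.
cmd = build_perf_test_command("model.onnx", "result.txt", ["-m", "times", "-e", "cpu"])
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Keeping the command as a list avoids quoting problems if it is later handed to `subprocess.run`.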

In [17]:
#result = pipeline.perf_test(model=model, result="output.txt")
result = pipeline.perf_test()   # is ok, too

Cores:  6



/home/artr/repo/onnxruntime/include/onnxruntime/core/session/onnxruntime_cxx_api.h:127 OrtCreateSession /home/artr/repo/onnxruntime/onnxruntime/core/providers/cuda/cuda_call.cc:97 bool onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*) [with ERRTYPE = cudaError; bool THRW = true] /home/artr/repo/onnxruntime/onnxruntime/core/providers/cuda/cuda_call.cc:91 bool onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*) [with ERRTYPE = cudaError; bool THRW = true] CUDA failure 35: CUDA driver version is insufficient for CUDA runtime version ; GPU=32548 ; hostname=ad7d8f44b7ef ; expr=cudaSetDevice(device_id_); 

Stacktrace:



Stacktrace:







Setting thread pool size to 0





/home/artr/repo/onnxruntime/include/onnxruntime/core/session/onnxruntime_cxx_api.h:127 OrtCreateSession /home/artr/repo/onnxruntime/onnxruntime/core/providers/cuda/cuda_call.cc:97 bool onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, con



Setting thread pool size to 0

Total time cost:0.0787948

Total iterations:10

Average time cost:7.87948 ms



ngraph 7.87948



/home/artr/repo/onnxruntime/include/onnxruntime/core/session/onnxruntime_cxx_api.h:127 OrtCreateSession /home/artr/repo/onnxruntime/onnxruntime/core/providers/cuda/cuda_call.cc:97 bool onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*) [with ERRTYPE = cudaError; bool THRW = true] /home/artr/repo/onnxruntime/onnxruntime/core/providers/cuda/cuda_call.cc:91 bool onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*) [with ERRTYPE = cudaError; bool THRW = true] CUDA failure 35: CUDA driver version is insufficient for CUDA runtime version ; GPU=32762 ; hostname=ad7d8f44b7ef ; expr=cudaSetDevice(device_id_); 

Stacktrace:



Stacktrace:







Setting thread pool size to 0





/home/artr/repo/onnxruntime/include/onnxruntime/core/session/onnxruntime_cxx_api.h:127 OrtCreateSession /home/artr/repo/onnxruntime/onnx



mkldnn_openmp OMP_NUM_THREADS=1, active 85.2826





Setting thread pool size to 0

Total time cost:0.211335

Total iterations:10

Average time cost:21.1335 ms







Setting thread pool size to 0

Total time cost:0.200294

Total iterations:10

Average time cost:20.0294 ms



mkldnn_openmp passive 20.0294





Setting thread pool size to 0

Total time cost:0.192117

Total iterations:10

Average time cost:19.2117 ms







Setting thread pool size to 0

Total time cost:0.190836

Total iterations:10

Average time cost:19.0836 ms



mkldnn_openmp active 19.0836





Setting thread pool size to 0

Total time cost:0.161365

Total iterations:10

Average time cost:16.1365 ms







Setting thread pool size to 0

Total time cost:0.15832

Total iterations:10

Average time cost:15.832 ms



cpu 15.832





Setting thread pool size to 0

Total time cost:0.189687

Total iterations:10

Average time cost:18.9687 ms







Setting thread pool size to 0

Total time cost:0.192735

Total iterations:10

## Run parameters

(1) model: string

    Optional. The path of the model to run the performance test on.
    
(2) result: string

    Optional. The path of the result.
    

In [5]:
pipeline.print_result(result)
#pipeline.print_result() # is ok, too

ngraph 0.172 ms

cpu_openmp OMP_NUM_THREADS=6, passive 0.24281 ms

mkldnn_openmp active 0.266838 ms

cpu_openmp OMP_NUM_THREADS=1, passive 0.267542 ms

mkldnn_openmp OMP_NUM_THREADS=6, passive 0.269421 ms

mkldnn_openmp OMP_NUM_THREADS=3, active 0.276784 ms

mkldnn_openmp OMP_NUM_THREADS=2, passive 0.281424 ms

cpu_openmp OMP_NUM_THREADS=4, passive 0.282803 ms

cpu_openmp OMP_NUM_THREADS=6, active 0.286684 ms

cpu_openmp OMP_NUM_THREADS=1, active 0.288663 ms

cpu_openmp OMP_NUM_THREADS=5, passive 0.290468 ms

mkldnn_openmp OMP_NUM_THREADS=5, passive 0.29485 ms

mkldnn_openmp OMP_NUM_THREADS=6, active 0.299117 ms

cpu_openmp OMP_NUM_THREADS=3, passive 0.300064 ms

mkldnn_openmp OMP_NUM_THREADS=4, passive 0.300399 ms

mkldnn_openmp OMP_NUM_THREADS=3, passive 0.301225 ms

mkldnn_openmp OMP_NUM_THREADS=1, passive 0.306567 ms

cpu_openmp OMP_NUM_THREADS=3, active 0.307462 ms

mkldnn_openmp OMP_NUM_THREADS=5, active 0.309917 ms

mkldnn_openmp OMP_NUM_THREADS=2, active 0.312901 ms

mkldnn_ope

## Run parameters

(1) result: string

    Optional. The path of the result.
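
If you want to work with the printed results programmatically, lines in the format shown above can be parsed into a configuration label and a latency. The following is a minimal sketch that assumes each result line ends with a number followed by "ms", as in the output of print_result. The helper `parse_result_line` is hypothetical and not part of onnxpipeline.

```python
def parse_result_line(line):
    """Split '<config> <latency> ms' into (config, latency_ms).

    Returns None for blank, truncated, or otherwise unmatched lines.
    Hypothetical helper for illustration; not part of onnxpipeline.
    """
    parts = line.strip().rsplit(None, 2)
    if len(parts) != 3 or parts[2] != "ms":
        return None
    try:
        return parts[0], float(parts[1])
    except ValueError:
        return None


# Sample lines copied from the print_result output above.
sample = """\
ngraph 0.172 ms

cpu_openmp OMP_NUM_THREADS=6, passive 0.24281 ms

mkldnn_openmp active 0.266838 ms
"""

results = [r for r in map(parse_result_line, sample.splitlines()) if r]
fastest = min(results, key=lambda r: r[1])
print(fastest)
```

For the sample lines, the fastest configuration is ngraph.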

# Netron

In [3]:
# Only works when the notebook runs on a local server
import netron
netron.start(model) # 'model.onnx'
from IPython.display import IFrame
IFrame('http://localhost:8080', width=700, height=350)


Stopping http://localhost:8080
Serving 'model.onnx' at http://localhost:8080
