This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

fail to mxnet to onnx #18509

Open
alicera opened this issue Jun 8, 2020 · 12 comments

Comments

@alicera

alicera commented Jun 8, 2020

Cause:
Converting the MXNet model to ONNX succeeds.
But when I run
onnx2trt *.onnx -o *.trt
the conversion fails.

Environment:
pip install mxnet-cu102
pip install onnx==1.3.0 -i https://pypi.tuna.tsinghua.edu.cn/simple

Model:
https://www.dropbox.com/s/akxeqp99jvsd6z7/model-MobileFaceNet-arcface-ms1m-refine-v1.zip?dl=0
ref : https://github.com/deepinsight/insightface

MXNet-to-ONNX conversion code:

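import mxnet as mx
import numpy as np
from mxnet.contrib import onnx as onnx_mxnet

# Paths to the trained MXNet symbol and parameter files
sym = './model/model-symbol.json'
params = './model/model-0000.params'
# Input shape in NCHW layout: batch 1, 3 channels, 112x112 pixels
input_shape = (1, 3, 112, 112)
onnx_file = './mxnet_exported_mobileface.onnx'

# Export the model to ONNX; the call returns the path of the written file
converted_model_path = onnx_mxnet.export_model(sym, params, [input_shape], np.float32, onnx_file)
print("finish")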

onnx2trt *.onnx -o *.trt
Error:
Parsing model
[2020-06-08 04:45:16 ERROR] (Unnamed Layer* 10) [Parametric ReLU]: slope tensor must be unidirectional broadcastable to input tensor
[2020-06-08 04:45:16 ERROR] (Unnamed Layer* 10) [Parametric ReLU]: slope tensor must be unidirectional broadcastable to input tensor
[2020-06-08 04:45:16 ERROR] (Unnamed Layer* 10) [Parametric ReLU]: slope tensor must be unidirectional broadcastable to input tensor
While parsing node number 5 [Conv -> "conv_2_dw_conv2d"]:
ERROR: /opt/onnx-tensorrt/builtin_op_importers.cpp:463 In function importConv:
[8] Assertion failed: nbSpatialDims == kernel_weights.shape.nbDims - 2

@alicera alicera added the Bug label Jun 8, 2020
@TMVector

TMVector commented Jun 8, 2020

Handy dandy colab notebook reproduces a similar error in ONNX Runtime, implying it's an error in the model or in the exporter.
https://colab.research.google.com/drive/163tkcoYT09Ap31vYotWlhSOshlgWd5V-?usp=sharing

@szha
Member

szha commented Jun 8, 2020

@alicera thanks for reporting this, and @TMVector thanks for sharing the notebook that reproduces the issue. Would you mind trying the MXNet nightly builds for 1.7 and 2.0 to see whether the issue still exists?

Depending on the result, @ciyongch may need to include the fix in the upcoming 1.7 release.

@szha
Member

szha commented Jun 8, 2020

You can install 1.x nightly builds with

pip install --pre --upgrade 'mxnet<2' -f https://dist.mxnet.io/python

or 2.x nightly builds with

pip install --pre --upgrade 'mxnet' -f https://dist.mxnet.io/python

@TMVector

TMVector commented Jun 9, 2020

Just attaching the model zip to get a stable URL: model-MobileFaceNet-arcface-ms1m-refine-v1.zip

@TMVector

TMVector commented Jun 9, 2020

Thanks for the reply @szha. I got the same error with 1.x and 2.x nightly using the commands you provided (notebook is updated):

RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running PRelu node. Name:'conv_1_relu' Status Message: /onnxruntime_src/onnxruntime/core/providers/cpu/math/element_wise_ops.h:351 void onnxruntime::BroadcastIterator::Init(int64_t, int64_t) axis == 1 || axis == largest was false. Attempting to broadcast an axis by a dimension other than 1. 56 by 64
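
Both the TensorRT and the ONNX Runtime messages point at the same rule: the PRelu slope has to be unidirectionally broadcastable to the input. A minimal sketch of that check in plain Python follows. The function name is hypothetical, and the input shape (1, 64, 56, 56) is an assumption inferred from the "56 by 64" in the message above, not read from the model file.

```python
def is_unidirectional_broadcastable(slope_shape, input_shape):
    """Return True if slope_shape can be broadcast to input_shape
    without changing input_shape (ONNX-style unidirectional broadcast)."""
    # The slope may not have more dimensions than the input.
    if len(slope_shape) > len(input_shape):
        return False
    # Shapes are aligned from the trailing dimension; each slope
    # dimension must be 1 or equal to the matching input dimension.
    for s, d in zip(reversed(slope_shape), reversed(input_shape)):
        if s != 1 and s != d:
            return False
    return True


# Assumed NCHW activation shape at 'conv_1_relu' (hypothetical values).
input_shape = (1, 64, 56, 56)

# A 1-D slope of 64 values is aligned with the last axis (56): rejected.
print(is_unidirectional_broadcastable((64,), input_shape))       # False
# The same 64 values shaped (64, 1, 1) line up with the channel axis.
print(is_unidirectional_broadcastable((64, 1, 1), input_shape))  # True
```

Under these assumptions, a per-channel slope exported as a flat vector fails exactly the way the message describes, because its single axis is compared with the width of the input rather than with the channel count.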

@ciyongch
Contributor

ciyongch commented Jun 9, 2020

As confirmed by @TMVector, both 1.x and 2.x have this issue. Would anyone be able to help fix it? Thanks!

@szha
Member

szha commented Jun 9, 2020

@sandeep-krishnamurthy is this something your team could help out with?

@leezu
Contributor

leezu commented Jun 9, 2020

Have you tried onnx 1.5.0? @alicera

@ciyongch
Contributor

Hi @sandeep-krishnamurthy @szha @leezu, may I know whether this is a blocking issue for 1.7.0? Thanks.

@sandeep-krishnamurthy
Contributor

I tested with MXNet 1.5.0 and 1.6.0 and, unfortunately, the issue exists there as well.
This is a bug that needs to be fixed. However, I think it is not a blocker for v1.7.0 and can be listed as a known issue. Suggestions, @szha @leezu?

@cloudhan

cloudhan commented Jul 2, 2020

You can monkey patch the conversion script with

from mxnet.contrib import onnx as onnx_mxnet
from mxnet.contrib.onnx.mx2onnx.export_onnx import MXNetGraph as mx_op
from mxnet.contrib.onnx.mx2onnx._op_translations import get_inputs, get_boolean_attribute_value, parse_helper, convert_string_to_list

# NOTE: the 1-D shape of the PReLU gamma is not unidirectionally broadcastable to the input.
@mx_op.register("LeakyReLU")
def convert_leakyrelu(node, **kwargs):
    # Write your own conversion code here. Hint: unsqueeze the gamma.
    raise NotImplementedError("custom LeakyReLU conversion goes here")
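
To make the "unsqueeze the gamma" hint concrete, here is a small sketch of the shape arithmetic only, assuming an NCHW input where the channel axis is axis 1. The helper name is hypothetical and it is not part of MXNet or ONNX; a real converter would still have to emit the slope tensor with the resulting shape.

```python
def unsqueeze_gamma_shape(gamma_shape, input_rank=4):
    """Append trailing 1s so a per-channel (C,) slope lines up with the
    channel axis of a channels-first input of the given rank."""
    gamma_shape = tuple(gamma_shape)
    # Only a flat per-channel vector needs reshaping in this sketch.
    if len(gamma_shape) != 1:
        return gamma_shape
    # One trailing 1 per spatial dimension (rank minus batch and channel).
    spatial_dims = max(input_rank - 2, 0)
    return gamma_shape + (1,) * spatial_dims


# For a 4-D NCHW input, 64 per-channel slopes become shape (64, 1, 1).
print(unsqueeze_gamma_shape((64,)))  # (64, 1, 1)
```

With shape (C, 1, 1), the slope broadcasts over height and width and matches the channel dimension, which is what the broadcasting rule quoted in the error messages asks for.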

@leezu
Contributor

leezu commented Jul 2, 2020

@cloudhan would you like to open a PR with a fix?

@szha szha added the ONNX label Jul 16, 2020