Pytorch2onnx #3075

drcut · 2020-06-19T07:47:11Z

Support converting RetinaNet from PyTorch to ONNX.
The computation results of PyTorch and ONNX can be verified against each other.
This PR does several things:
[1] Replaces some PyTorch ops that are not supported by ONNX
[2] Replaces some dynamic shapes with static shapes, as ONNX only supports constant shapes in some cases
[3] Fixes some bugs in PyTorch 1.3's ONNX conversion that may cause numerical errors when running with onnxruntime
[4] Updates tools/pytorch2onnx.py to use our new API

drcut · 2020-06-19T08:18:17Z

What does "from onnx_util.symbolic import register_extra_symbolics" mean? I do not have this module. @drcut

This function is in tools/onnx_util/symbolic.py. It is for users on PyTorch 1.3, whose ONNX exporter has some bugs and does not implement the TopK op. You can regard register_extra_symbolics as a mock patch.
Thanks
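
To illustrate the "mock patch" idea, here is a toy, dependency-free sketch of an op registry keyed by opset version. The names `_registry`, `register_symbolic`, `export_op`, and `topk_symbolic` are hypothetical and do not reproduce PyTorch's internal API; the only point is that an exporter looks up one symbolic function per op, and a patch fills in entries the stock registry lacks.

```python
# Toy model of an exporter's symbolic registry (hypothetical names, not PyTorch's API).
_registry = {9: {"relu": lambda x: ("Relu", x)}}


def register_symbolic(opset, opname, fn):
    """Add or override the symbolic function for `opname` at `opset`."""
    _registry.setdefault(opset, {})[opname] = fn


def export_op(opset, opname, *args):
    """Look up the symbolic for an op; an unregistered op raises KeyError."""
    return _registry[opset][opname](*args)


def topk_symbolic(x, k):
    # Stand-in for emitting an ONNX TopK node.
    return ("TopK", x, k)


def register_extra_symbolics(opset):
    """Patch in the symbolics that the stock registry is missing."""
    register_symbolic(opset, "topk", topk_symbolic)


try:
    export_op(9, "topk", "scores", 5)
    patch_needed = False
except KeyError:
    patch_needed = True  # the stock registry has no TopK entry

register_extra_symbolics(9)
node = export_op(9, "topk", "scores", 5)
```

Calling the patch once before export is enough in this sketch, because the lookup happens at export time rather than at import time.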

drcut · 2020-06-19T08:30:31Z

Can you provide me with an ONNX model of RetinaNet? I will use it for testing. The COCO dataset with resnet50fpn is fine, thank you @drcut

As ONNX cannot support a dynamic input shape, I believe it is more convenient for you to run the code locally, so that you can set whatever input shape you want.
Please generate the model with the command below:
python tools/pytorch2onnx.py configs/retinanet/retinanet_r50_fpn_1x_coco.py --checkpoint checkpoints/retinanet/retinanet_r50_fpn_1x_20181125-3d3c2142.pth --verify --shape 20 20 --output_file retinanet.onnx
If anything goes wrong, please tell me and maybe I can help.
Thanks
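
Because the graph is exported with a fixed input size, the values passed to --shape decide what gets baked into the model. A minimal sketch of how such an argument could be expanded into an NCHW shape follows; the helper name `to_input_shape` and the choice of batch size 1 with 3 channels are assumptions for illustration, not something confirmed in this thread.

```python
import argparse


def to_input_shape(shape):
    """Expand one or two ints into an NCHW shape (batch 1, 3 channels assumed)."""
    if len(shape) == 1:
        height = width = shape[0]
    elif len(shape) == 2:
        height, width = shape
    else:
        raise ValueError("expected one or two integers for --shape")
    return (1, 3, height, width)


parser = argparse.ArgumentParser()
parser.add_argument("--shape", type=int, nargs="+", required=True)

# Parse the same values used in the command above.
args = parser.parse_args(["--shape", "20", "20"])
input_shape = to_input_shape(args.shape)
```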

drcut · 2020-06-19T08:39:28Z

I used the command "python tools/pytorch2onnx.py --config ./configs/retinanet/retinanet_r50_fpn_1x_coco.py --checkpoint ./retinanet_r50_fpn_1x_coco_20200130-c2398f9e.pth --output_file ./6n.onnx", but got the error "RuntimeError: Only tuples, lists and Variables supported as JIT inputs/outputs. Dictionaries and strings are also accepted but their usage is not recommended. But got unsupported type numpy.ndarray". I urgently need a working RetinaNet ONNX model for testing, because I report to my teacher tomorrow, so I hope you can provide one, thank you @drcut

OK, just for this emergency case :)
Could you tell me the input shape you need?

drcut · 2020-06-19T09:07:48Z

My email is "manhongnie@gmail.com"

I just converted an ONNX model. There is some numerical difference between PyTorch and ONNX, which may be due to the dummy input. Anyway, here is the link:
https://pan.baidu.com/s/1GP-si3oDdTndoC82tZgQlA (password: dbju)
Hope this helps

drcut · 2020-06-19T09:17:13Z

Okay, thank you, in fact, my input size does not matter, I just used it for testing.
@drcut

OK, good luck with your meeting tomorrow!

hellock · 2020-06-19T10:15:13Z

Have you tested converting mmdetection to TensorRT? I have not succeeded.

You need to implement it yourself. It is not supported yet.

drcut · 2020-06-20T01:32:26Z

OK, but why does it run incorrectly even in onnxruntime? Is there an ONNX node that is not supported?
@drcut

First, I used a pretrained PyTorch model for the conversion, but this pth file is not fully compatible with the master code, which means some ops do not have corresponding pretrained values, so the original PyTorch result is also somewhat poor. Second, although I am not sure, I believe there is some numerical difference between PyTorch and onnxruntime; I will figure it out next week. Finally, why do you say Resize should be replaced by Upsample? Resize is a standard ONNX op that ONNX Runtime can execute directly.

drcut · 2020-06-20T01:51:20Z

Because according to the structure that place is Upsample, but it is recognized as Resize in opset 11, and its output is still problematic, so it needs to be replaced and rewritten; the same goes for TopK. When I said the onnxruntime test result is wrong, I meant that nothing is recognized at all. If it were just a small numerical error that does not affect the result, it would be fine.
NVIDIA engineers use this piece of code

"import torch
import torch.onnx.symbolic_opset10 as onnx_symbolic

def upsample_nearest2d(g, input, output_size, *args):
    # Currently, TRT 5.1/6.0/7.0 ONNX Parser does not support all ONNX ops
    # needed to support dynamic upsampling ONNX formulation
    # Here we hardcode scale=2 as a temporary workaround
    scales = g.op("Constant", value_t=torch.tensor([1., 1., 2., 2.]))
    return g.op("Resize", input, scales, mode_s="nearest")

onnx_symbolic.upsample_nearest2d = upsample_nearest2d"

to replace Upsample.
@drcut

I have only tested the ONNX model on CPU using onnxruntime. Maybe it does not work on GPU. In fact, we have been working on another part that supports converting from ONNX to TRT, but it has not been published yet. I'm afraid you will have to implement the corresponding part yourself.

drcut · 2020-06-20T02:16:04Z

Can you give me your onnxruntime test code? Please take a look at my test results; is this the result I should expect? As for the TRT part, I have also written the decode and NMS parts, but I am currently stuck without a correct ONNX model.
@drcut

Please see tools/pytorch2onnx.py; the onnxruntime code is executed when --verify is used.
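
The kind of check that --verify performs can be pictured as an element-wise tolerance comparison between the two backends' outputs. The sketch below uses plain Python lists; the helper name `outputs_match`, the tolerance values, and the sample numbers are assumptions for illustration, and the actual script may compare different outputs with different tolerances.

```python
import math


def outputs_match(reference, candidate, rel_tol=1e-3, abs_tol=1e-5):
    """Return True if two flat sequences of floats agree within tolerance."""
    if len(reference) != len(candidate):
        return False
    return all(
        math.isclose(r, c, rel_tol=rel_tol, abs_tol=abs_tol)
        for r, c in zip(reference, candidate)
    )


# Made-up scores standing in for PyTorch and onnxruntime outputs.
pytorch_scores = [0.9132, 0.5571, 0.1204]
onnx_scores = [0.9133, 0.5570, 0.1204]

small_drift_ok = outputs_match(pytorch_scores, onnx_scores)
large_drift_ok = outputs_match(pytorch_scores, [0.70, 0.5570, 0.1204])
```

A tolerance-based comparison distinguishes ordinary floating-point drift between backends from a genuinely broken export, which is the distinction being argued about in this thread.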

drcut · 2020-06-20T03:27:18Z

Can you tell me about the environment you are running in? I can't run it.
@drcut

PyTorch 1.3, Python 3.7.5. Please make sure you use the correct branch.

drcut · 2020-06-22T02:40:17Z

Can you do me a favor? Tell me how to add a new Sigmoid node to an ONNX model, or how to view the output of an intermediate node in an ONNX model?
@drcut

Sorry, I do not know how to modify an ONNX model directly. The ONNX model is just an intermediate file used to convert PyTorch to other backend engines.

drcut · 2020-06-22T04:42:28Z

So how do you deal with ops that are not supported by TensorRT, such as "NonZero", "GatherND", and "NonMaxSuppression"? Can your TensorRT implementation of RetinaNet be made public?
@drcut

Maybe in the future :). We implement TensorRT plugins for customized ops.

manhongnie · 2020-06-22T07:34:11Z

After testing, your ONNX model is wrong. You can simplify the input and output.
@drcut

drcut · 2020-06-22T08:19:16Z

After testing, your ONNX model is wrong. You can simplify the input and output.
@drcut

Would you please describe why it's incorrect?

manhongnie · 2020-06-22T08:49:28Z

For TopK and Upsample, there is no corresponding operation in the model, and its output is wrong.
@drcut

manhongnie · 2020-06-22T08:52:07Z

I checked that ONNX supports these operations, but the ONNX model you converted does not contain them. You can check whether your PyTorch 1.3 supports them.
@drcut

xvjiarui · 2020-06-29T15:51:02Z

mmdet/models/detectors/single_stage.py

@@ -93,6 +94,10 @@ def simple_test(self, img, img_metas, rescale=False):
        outs = self.bbox_head(x)
        bbox_list = self.bbox_head.get_bboxes(
            *outs, img_metas, rescale=rescale)
+        # return in advance when export to ONNX


Suggested change

# return in advance when export to ONNX

# skip post-processing when exporting to ONNX

mmdet/core/post_processing/bbox_nms.py

mmdet/ops/nms/nms_wrapper.py

xvjiarui · 2020-06-29T17:56:06Z

mmdet/core/bbox/transforms.py

@@ -97,8 +97,9 @@ def bbox2result(bboxes, labels, num_classes):
    if bboxes.shape[0] == 0:
        return [np.zeros((0, 5), dtype=np.float32) for i in range(num_classes)]
    else:
-        bboxes = bboxes.cpu().numpy()
-        labels = labels.cpu().numpy()
+        if isinstance(bboxes, torch.Tensor):


Is this necessary? If so, we need to update the docstring.

After executing bbox2result, MMDet returns np.ndarray values, which ONNX does not support (ONNX tracing can only record tensor ops, not NumPy ops), so only the part before bbox2result can be converted into ONNX. To compare the results of PyTorch and ONNX, we therefore have to apply bbox2result to the ONNX output as well. In that case the input of bbox2result is an np.ndarray (onnxruntime's output type).

We may keep this part unchanged and use [bboxes[labels == i, :] for i in range(num_classes)] in pytorch2onnx() instead.
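
The per-class grouping in that suggestion can be sketched without any framework. Plain lists stand in for arrays here, and `group_by_class` is a hypothetical helper; with NumPy arrays the same grouping is the boolean-indexing expression quoted above.

```python
def group_by_class(bboxes, labels, num_classes):
    """Split detections into one list per class.

    bboxes: list of [x1, y1, x2, y2, score] rows
    labels: list of int class indices, one per row
    """
    grouped = [[] for _ in range(num_classes)]
    for box, label in zip(bboxes, labels):
        grouped[label].append(box)
    return grouped


# Made-up detections for illustration.
detections = [
    [10.0, 10.0, 50.0, 60.0, 0.92],
    [12.0, 8.0, 48.0, 58.0, 0.41],
    [100.0, 40.0, 180.0, 90.0, 0.77],
]
class_ids = [0, 2, 0]
per_class = group_by_class(detections, class_ids, num_classes=3)
```

Since this grouping only needs indexing, it works on data that is already on the host, which is why it suits the onnxruntime output without touching bbox2result.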

xvjiarui · 2020-06-29T17:58:00Z

We may also need to update doc here.
We could list supported methods in the doc.

mmdet/core/anchor/anchor_generator.py

drcut

I have submitted a new version according to the reviewer's comments, which mainly modifies how ga_rpn_head calls NMS.

drcut · 2020-06-30T03:57:10Z

mmdet/core/post_processing/bbox_nms.py

mmdet/core/anchor/anchor_generator.py

xvjiarui · 2020-07-05T03:45:03Z

tools/pytorch2onnx.py

    parser.add_argument(
-        '--out', type=str, required=True, help='output ONNX filename')
+        '--verify', action='store_true', help='verify the onnx model')


Suggested change

'--verify', action='store_true', help='verify the onnx model')

'--verify', action='store_true', help='verify the onnx model output against pytorch output')

xvjiarui · 2020-07-05T03:47:07Z

tools/pytorch2onnx.py

    parser.add_argument('config', help='test config file path')
-    parser.add_argument('checkpoint', help='checkpoint file')


For object detection, we may make checkpoint a required argument. Without checkpoint, some branches may not be covered.

xvjiarui · 2020-07-05T03:50:25Z

tools/pytorch2onnx.py

+    one_img = mmcv.imread(input_img, 'color')
+    one_img = mmcv.imresize(one_img, input_shape[2:]).transpose(2, 0, 1)
+    # normalize the input images
+    one_img = torch.from_numpy((one_img - 128) / 256).unsqueeze(0).float()


Why is the normalization fixed to 128?

I suggest making image norm a user input. The default could be ImageNet mean/std.

After some testing, I decided to remove the normalization step, as without it we still get a correct RetinaNet export with the default picture. Besides, MMDet raises an error if NMS is not executed during ONNX tracing.

Removing img_norm may give incorrect results for other images.
I suggest making image norm a user input. The default could be ImageNet mean/std.
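
One way to picture this suggestion is a pair of command-line options that default to ImageNet statistics. The sketch below is only an illustration: the flag names `--mean` and `--std` and the helper `normalize_pixel` are assumptions, and the default values are the commonly used ImageNet mean/std on a 0-255 RGB scale.

```python
import argparse

# Commonly used ImageNet statistics on a 0-255 RGB scale (assumed defaults).
IMAGENET_MEAN = [123.675, 116.28, 103.53]
IMAGENET_STD = [58.395, 57.12, 57.375]


def normalize_pixel(pixel, mean, std):
    """Normalize one RGB pixel channel by channel: (value - mean) / std."""
    return [(v - m) / s for v, m, s in zip(pixel, mean, std)]


parser = argparse.ArgumentParser()
parser.add_argument("--mean", type=float, nargs=3, default=IMAGENET_MEAN)
parser.add_argument("--std", type=float, nargs=3, default=IMAGENET_STD)

# No flags: fall back to the ImageNet defaults.
defaults = parser.parse_args([])
# Explicit flags: reproduce the fixed (x - 128) / 256 scheme from the diff.
custom = parser.parse_args(
    ["--mean", "128", "128", "128", "--std", "256", "256", "256"]
)

default_result = normalize_pixel(IMAGENET_MEAN, defaults.mean, defaults.std)
custom_result = normalize_pixel([128.0, 192.0, 64.0], custom.mean, custom.std)
```

Keeping the old constants reachable through flags means the earlier behavior stays reproducible while other images can use statistics that match their training config.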

drcut · 2020-07-24T08:23:01Z

OK. I have one more question about ONNX that I would like to ask you. I have an ONNX model whose TopK node is broken, and I need to modify that node, but I do not know how to define it.
"new_scale_node1 = onnx.helper.make_node(
    "TopK",
    inputs=['1006', '1009'],
    outputs=['1010', '1011'],
    #value=onnx.helper.make_tensor('value', onnx.TensorProto.FLOAT, [4], [1, 256, 160, 160])
    #values, indices = topk_sorted_implementation("1006", "1009", axis, largest)
)"
Can you give me some guidance?

Sorry, I don't know.
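
For reference, the ONNX TopK operator takes two inputs (the data tensor and K) and produces two outputs (values and indices), which matches the two-in, two-out wiring in the snippet above. A plain-Python sketch of that behavior for a 1-D input follows; `topk` is a hypothetical helper written for illustration and is not part of the onnx package.

```python
def topk(values, k, largest=True):
    """Return (top-k values, their indices) for a 1-D list, sorted by value."""
    order = sorted(
        range(len(values)), key=lambda i: values[i], reverse=largest
    )[:k]
    return [values[i] for i in order], order


scores = [0.2, 0.9, 0.5, 0.7]
top_values, top_indices = topk(scores, 2)
low_values, low_indices = topk(scores, 2, largest=False)
```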

drcut · 2020-07-24T08:47:53Z

@drcut
I just built mmcv-full from source, but I got an error
"RuntimeError: Only tuples, lists and Variables supported as JIT inputs/outputs. Dictionaries and strings are also accepted but their usage is not recommended. But got unsupported type numpy.ndarray"
My command is
"python tools/pytorch2onnx.py configs/retinanet/retinanet_r50_fpn_1x_coco.py ./retinanet_r50_fpn_1x_coco_20200130-c2398f9e.pth --output_file ./6nm.onnx"
Although you may find me very annoying, I hope you can help me solve this problem, otherwise I will keep coming back. Although my teacher may have given up on me, I have not.

All right, I will give you a full command list. Please give me some time.

drcut · 2020-07-27T03:43:25Z

@drcut
When you are done, please let me know.

Hi, I have tried the following commands to convert RetinaNet, and I'm quite sure they work:
Step 1: Build PyTorch 1.3 (the PyTorch version is very important!)
Step 2: Download mmcv
git clone https://github.com/open-mmlab/mmcv.git
and build it from source (run inside the cloned mmcv directory)
MMCV_WITH_OPS=1 pip install --user -e .
Step 3: Download the corresponding mmdet
git clone https://github.com/open-mmlab/mmdetection.git onnx_mmdet
cd onnx_mmdet
and check out the correct branch
git fetch origin pull/3075/head:pull_3075
git checkout pull_3075
Step 4: Build mmdet
pip install --user -e .
Step 5: Install onnxruntime
pip install --user onnxruntime
Step 6: Convert RetinaNet
python -u tools/pytorch2onnx.py configs/retinanet/retinanet_r50_fpn_1x_coco.py retinanet_r50_fpn_2x_coco_20200131-fdb43119.pth --verify --output_file 6nm.onnx

drcut · 2020-07-27T03:52:23Z

@drcut
Do you mean that it will not succeed if I use PyTorch 1.5?
OK, I will test it now following your steps, but I still hope you can update to PyTorch 1.5.

Yes, because the ONNX symbolics differ between PyTorch 1.3 and PyTorch 1.5. However, I could not reproduce the bug you reported when I used PyTorch 1.5.

drcut · 2020-07-27T04:47:43Z

@drcut
Then why is there a problem when I use version 1.5? Could you send us your installation environment?

I think it's because you used the wrong branch. You should pull my PR and check it out.
I just use a simple environment with Python 3.6.

drcut · 2020-07-27T05:05:13Z

@drcut
I am using
"git clone https://github.com/drcut/mmdetection.git -b pytorch2onnx". Isn't this the right branch, or is there something wrong with my command?

This is the correct branch. I have no idea what causes your bug, as I cannot reproduce it.

hellock · 2020-07-30T14:52:54Z

mmdet/models/necks/fpn.py

@@ -182,6 +182,9 @@ def forward(self, inputs):
                                                 **self.upsample_cfg)
            else:
                prev_shape = laterals[i - 1].shape[2:]
+                # convert prev_shape from torch.Size to tuple
+                # so that we can convert F.interpolate into ONNX
+                prev_shape = tuple(int(e) for e in prev_shape)


prev_shape = tuple(laterals[i - 1].shape[2:])

hellock · 2020-07-30T14:53:53Z

tools/pytorch2onnx.py

 import torch
-from mmcv.ops import RoIAlign, RoIPool
+from mmcv.onnx.symbolic import register_extra_symbolics


We can raise an error message if mmcv version is low.
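
A guard of that kind could look like the sketch below. It is only an illustration: the helper names `parse_version` and `require_min_version` are hypothetical, and the minimum version passed in the usage line is a placeholder, not the version the PR actually enforces.

```python
def parse_version(version):
    """Turn '1.0.5' into (1, 0, 5); non-numeric suffixes are ignored."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    return tuple(parts)


def require_min_version(name, installed, minimum):
    """Raise ImportError with a clear message if `installed` is too old."""
    if parse_version(installed) < parse_version(minimum):
        raise ImportError(
            f"{name}>={minimum} is required for ONNX export, "
            f"but {installed} is installed"
        )


# Placeholder minimum version, chosen only for the example.
require_min_version("mmcv", "1.0.5", "1.0.4")
```

Failing early with the required version in the message is friendlier than letting the later import of the symbolic helpers fail with a bare ImportError.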

breAchyz · 2020-08-06T11:03:43Z

I ran into a problem when converting RetinaNet to ONNX. Could you help me?
I trained RetinaNet on my own VOC dataset with 'retina_r50_fpn.py' and failed to convert it to ONNX using the code in pytorch2onnx.py. The error information is as follows:

Traceback (most recent call last):
File "/home/ding/pycharm-community-2017.1.3/helpers/pydev/pydevd.py", line 1585, in <module>
globals = debugger.run(setup['file'], None, None, is_module)
File "/home/ding/pycharm-community-2017.1.3/helpers/pydev/pydevd.py", line 1015, in run
pydev_imports.execfile(file, globals, locals) # execute the script
File "/home/ding/pycharm-community-2017.1.3/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "/home/ding/yz/Image_Recognize/mm_RetinaNet/convert2onnx.py", line 160, in <module>
normalize_cfg=normalize_cfg)
File "/home/ding/yz/Image_Recognize/mm_RetinaNet/convert2onnx.py", line 58, in pytorch2onnx
opset_version=opset_version)
File "/home/ding/miniconda3/envs/yz-mmdet2.3-tensorrt/lib/python3.7/site-packages/torch/onnx/__init__.py", line 168, in export
custom_opsets, enable_onnx_checker, use_external_data_format)
File "/home/ding/miniconda3/envs/yz-mmdet2.3-tensorrt/lib/python3.7/site-packages/torch/onnx/utils.py", line 69, in export
use_external_data_format=use_external_data_format)
File "/home/ding/miniconda3/envs/yz-mmdet2.3-tensorrt/lib/python3.7/site-packages/torch/onnx/utils.py", line 488, in _export
fixed_batch_size=fixed_batch_size)
File "/home/ding/miniconda3/envs/yz-mmdet2.3-tensorrt/lib/python3.7/site-packages/torch/onnx/utils.py", line 334, in _model_to_graph
graph, torch_out = _trace_and_get_graph_from_model(model, args, training)
File "/home/ding/miniconda3/envs/yz-mmdet2.3-tensorrt/lib/python3.7/site-packages/torch/onnx/utils.py", line 291, in _trace_and_get_graph_from_model
torch.jit._get_trace_graph(model, args, _force_outplace=False, _return_inputs_states=True)
File "/home/ding/miniconda3/envs/yz-mmdet2.3-tensorrt/lib/python3.7/site-packages/torch/jit/__init__.py", line 278, in _get_trace_graph
outs = ONNXTracedModule(f, _force_outplace, return_inputs, _return_inputs_states)(*args, **kwargs)
File "/home/ding/miniconda3/envs/yz-mmdet2.3-tensorrt/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "/home/ding/miniconda3/envs/yz-mmdet2.3-tensorrt/lib/python3.7/site-packages/torch/jit/__init__.py", line 361, in forward
self._force_outplace,
File "/home/ding/miniconda3/envs/yz-mmdet2.3-tensorrt/lib/python3.7/site-packages/torch/jit/__init__.py", line 348, in wrapper
outs.append(self.inner(*trace_inputs))
File "/home/ding/miniconda3/envs/yz-mmdet2.3-tensorrt/lib/python3.7/site-packages/torch/nn/modules/module.py", line 548, in __call__
result = self._slow_forward(*input, **kwargs)
File "/home/ding/miniconda3/envs/yz-mmdet2.3-tensorrt/lib/python3.7/site-packages/torch/nn/modules/module.py", line 534, in _slow_forward
result = self.forward(*input, **kwargs)
File "/home/ding/yz/Image_Recognize/mm_RetinaNet/mmdetection/mmdet/core/fp16/decorators.py", line 51, in new_func
return old_func(*args, **kwargs)
File "/home/ding/yz/Image_Recognize/mm_RetinaNet/mmdetection/mmdet/models/detectors/base.py", line 173, in forward
return self.forward_test(img, img_metas, **kwargs)
File "/home/ding/yz/Image_Recognize/mm_RetinaNet/mmdetection/mmdet/models/detectors/base.py", line 153, in forward_test
return self.simple_test(imgs[0], img_metas[0], **kwargs)
File "/home/ding/yz/Image_Recognize/mm_RetinaNet/mmdetection/mmdet/models/detectors/single_stage.py", line 112, in simple_test
*outs, img_metas, rescale=rescale)
File "/home/ding/yz/Image_Recognize/mm_RetinaNet/mmdetection/mmdet/core/fp16/decorators.py", line 131, in new_func
return old_func(*args, **kwargs)
File "/home/ding/yz/Image_Recognize/mm_RetinaNet/mmdetection/mmdet/models/dense_heads/anchor_head.py", line 574, in get_bboxes
scale_factor, cfg, rescale)
File "/home/ding/yz/Image_Recognize/mm_RetinaNet/mmdetection/mmdet/models/dense_heads/anchor_head.py", line 652, in _get_bboxes_single
cfg.max_per_img)
File "/home/ding/yz/Image_Recognize/mm_RetinaNet/mmdetection/mmdet/core/post_processing/bbox_nms.py", line 59, in multiclass_nms
raise RuntimeError('[ONNX Error] Can not record NMS '
RuntimeError: [ONNX Error] Can not record NMS as it has not been executed this time

My environment is pytorch1.5 mmdet2.3.0 mmcv1.0.5.

drcut · 2020-08-06T11:20:54Z

Hi.
As PyTorch uses tracing to convert a model into ONNX, it can only record the operations that are actually executed. From your description, I think the model may not generate any valid bboxes, so the program does not execute NMS, and the conversion result would be wrong.
So please use input data that produces valid bboxes. You can also apply some preprocessing to modify the data.
Besides, I have not tested on PyTorch 1.5. If anything goes wrong, maybe PyTorch 1.3 can help.
Best
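
The effect described here can be reproduced with a toy tracer that records only the operations that run. This is a simplified illustration with hypothetical names (`Tracer`, `detect`), not PyTorch's tracing machinery: when no score passes the threshold, the NMS stand-in never executes and is missing from the recorded trace.

```python
class Tracer:
    """Records the name of every op that actually runs."""

    def __init__(self):
        self.ops = []

    def run(self, name, fn, *args):
        self.ops.append(name)
        return fn(*args)


def detect(tracer, scores, score_thr=0.5):
    """Filter scores, then run a stand-in NMS only if something survives."""
    kept = tracer.run(
        "threshold", lambda s: [v for v in s if v > score_thr], scores
    )
    if not kept:
        # Nothing passed the threshold, so the NMS op never executes
        # and therefore never appears in the trace.
        return kept
    return tracer.run("nms", lambda s: sorted(s, reverse=True)[:1], kept)


empty_trace = Tracer()
detect(empty_trace, [0.1, 0.2])       # no score above the threshold

full_trace = Tracer()
detect(full_trace, [0.1, 0.9, 0.7])   # at least one valid detection
```

Comparing `empty_trace.ops` with `full_trace.ops` shows why the choice of input image matters for a traced export.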

breAchyz · 2020-08-06T13:18:38Z

Thanks for your rapid reply.
That fixed the error, and the model was successfully converted to ONNX format.

yhl41001 · 2020-08-21T10:21:38Z

File "/localdev/anaconda3/envs/mmdet01/lib/python3.7/site-packages/torch/onnx/symbolic_registry.py", line 91, in get_registered_op
return _registry[(domain, version)][opname]
KeyError: 'new_zeros'
This error occurs during conversion. Can it be solved?

tianwen0110 · 2020-09-25T11:24:16Z

File "/localdev/anaconda3/envs/mmdet01/lib/python3.7/site-packages/torch/onnx/symbolic_registry.py", line 91, in get_registered_op
return _registry[(domain, version)][opname]
KeyError: 'new_zeros'
This error occurs during conversion. Can it be solved?

I have met this problem; it is caused by the PyTorch version. Updating PyTorch should solve it. My version is PyTorch 1.6.

yhl41001 · 2020-09-27T08:26:07Z

@tianwen0110 OK, thanks!

drcut mentioned this pull request Jun 19, 2020

Support convert Retinanet to ONNX #2562

Closed

drcut force-pushed the pytorch2onnx branch from f4eb9d9 to b2127cd Compare June 22, 2020 11:00

xvjiarui reviewed Jun 29, 2020

mmdet/core/post_processing/bbox_nms.py

xvjiarui reviewed Jun 29, 2020

mmdet/ops/nms/nms_wrapper.py (outdated)

xvjiarui reviewed Jun 29, 2020

xvjiarui reviewed Jun 30, 2020

mmdet/core/anchor/anchor_generator.py

drcut commented Jul 2, 2020

hellock mentioned this pull request Jul 2, 2020

Roadmap of MMDetection #2931

Open

xvjiarui reviewed Jul 5, 2020

hellock reviewed Jul 30, 2020

hanruobing and others added 12 commits August 4, 2020 16:59

Update pytorch2onnx.py which using new logic to convert pytorch to ONNX

adf6126

use standard API to check whether in ONNX convert process

4a3ec2a

only compare the score value while verifying results between ONNX and pytorch

211e5d9

move import onnx before import torch, or something weird will happen

71a4c83

use real images for input

dd0b25c

modifying the way of calling nms

466b66f

modify docstring for bbox2result, and remove unnecessary part for onnx exporting

bb53c86

modify the 'Convert to ONNX' part in docs

cb692b4

replace or to | in docstring

aba65f1

update according to the latest mmcv

538a501

add normalize part

73ad077

raise error while using low version mmcv

4db396e

drcut force-pushed the pytorch2onnx branch from 67a0b80 to 4db396e Compare August 4, 2020 09:05

xvjiarui added 2 commits August 4, 2020 23:54

minor update

4508b28

minor update

c8a4b14

hellock merged commit 2f32a47 into open-mmlab:master Aug 4, 2020
