
ONNX Model #21

Closed
romil611 opened this issue Mar 29, 2022 · 18 comments

@romil611

Hi, I wish to convert the model into an ONNX model. When I try to run the pytorch2onnx.py script inside the tools directory, I get:

RuntimeError: Exporting the operator roll to ONNX opset version 11 is not supported. Please feel free to request support or submit a pull request on PyTorch GitHub.

Thanks.

@praeclarumjj3
Collaborator

Hi, have you tried running the same file for the original MMSegmentation code?

@romil611
Author

romil611 commented Mar 29, 2022

Sorry, I don't understand. Can you please tell me where the original code is? This repo was linked in the paper.
I'm trying to use SeMask-FPN and had run this file.

I tried adding the torch.roll op found here; that resolved the previous error, but the conversion still failed with this error instead:

Traceback (most recent call last):
  File "tools/pytorch2onnx.py", line 220, in <module>
    verify=args.verify)
  File "tools/pytorch2onnx.py", line 138, in pytorch2onnx
    opset_version=opset_version)
  File "/home/dreamvu/Downloads/PAL-Firmware-v3.2-HPD-NX-20211201T130112Z-001/PAL-Firmware-v3.2-HPD-NX/installations/dreamvu_ws/lib/python3.6/site-packages/torch/onnx/__init__.py", line 276, in export
    custom_opsets, enable_onnx_checker, use_external_data_format)
  File "/home/dreamvu/Downloads/PAL-Firmware-v3.2-HPD-NX-20211201T130112Z-001/PAL-Firmware-v3.2-HPD-NX/installations/dreamvu_ws/lib/python3.6/site-packages/torch/onnx/utils.py", line 94, in export
    use_external_data_format=use_external_data_format)
  File "/home/dreamvu/Downloads/PAL-Firmware-v3.2-HPD-NX-20211201T130112Z-001/PAL-Firmware-v3.2-HPD-NX/installations/dreamvu_ws/lib/python3.6/site-packages/torch/onnx/utils.py", line 698, in _export
    dynamic_axes=dynamic_axes)
  File "/home/dreamvu/Downloads/PAL-Firmware-v3.2-HPD-NX-20211201T130112Z-001/PAL-Firmware-v3.2-HPD-NX/installations/dreamvu_ws/lib/python3.6/site-packages/torch/onnx/utils.py", line 465, in _model_to_graph
    module=module)
  File "/home/dreamvu/Downloads/PAL-Firmware-v3.2-HPD-NX-20211201T130112Z-001/PAL-Firmware-v3.2-HPD-NX/installations/dreamvu_ws/lib/python3.6/site-packages/torch/onnx/utils.py", line 206, in _optimize_graph
    graph = torch._C._jit_pass_onnx(graph, operator_export_type)
  File "/home/dreamvu/Downloads/PAL-Firmware-v3.2-HPD-NX-20211201T130112Z-001/PAL-Firmware-v3.2-HPD-NX/installations/dreamvu_ws/lib/python3.6/site-packages/torch/onnx/__init__.py", line 309, in _run_symbolic_function
    return utils._run_symbolic_function(*args, **kwargs)
  File "/home/dreamvu/Downloads/PAL-Firmware-v3.2-HPD-NX-20211201T130112Z-001/PAL-Firmware-v3.2-HPD-NX/installations/dreamvu_ws/lib/python3.6/site-packages/torch/onnx/utils.py", line 994, in _run_symbolic_function
    return symbolic_fn(g, *inputs, **attrs)
  File "/home/dreamvu/Downloads/PAL-Firmware-v3.2-HPD-NX-20211201T130112Z-001/PAL-Firmware-v3.2-HPD-NX/installations/dreamvu_ws/lib/python3.6/site-packages/mmcv/onnx/symbolic.py", line 17, in symbolic_fn
    g, interpolate_mode, args)
  File "/home/dreamvu/Downloads/PAL-Firmware-v3.2-HPD-NX-20211201T130112Z-001/PAL-Firmware-v3.2-HPD-NX/installations/dreamvu_ws/lib/python3.6/site-packages/mmcv/onnx/onnx_utils/symbolic_helper.py", line 254, in _get_interpolate_attributes
    scales = _interpolate_get_scales_if_available(g, scales)
  File "/home/dreamvu/Downloads/PAL-Firmware-v3.2-HPD-NX-20211201T130112Z-001/PAL-Firmware-v3.2-HPD-NX/installations/dreamvu_ws/lib/python3.6/site-packages/mmcv/onnx/onnx_utils/symbolic_helper.py", line 229, in _interpolate_get_scales_if_available
    available_scales = _maybe_get_const(scales[0], 'f') != -1 and not _is_none(
  File "/home/dreamvu/Downloads/PAL-Firmware-v3.2-HPD-NX-20211201T130112Z-001/PAL-Firmware-v3.2-HPD-NX/installations/dreamvu_ws/lib/python3.6/site-packages/mmcv/onnx/onnx_utils/symbolic_helper.py", line 64, in _maybe_get_const
    return _parse_arg(value, desc)
  File "/home/dreamvu/Downloads/PAL-Firmware-v3.2-HPD-NX-20211201T130112Z-001/PAL-Firmware-v3.2-HPD-NX/installations/dreamvu_ws/lib/python3.6/site-packages/mmcv/onnx/onnx_utils/symbolic_helper.py", line 33, in _parse_arg
    return float(tval)
ValueError: only one element tensors can be converted to Python scalars

My goal is to build a TensorRT engine for this model. This is my first time using the mmcv tools. Some more clarification and help would be greatly appreciated.

I'm using:
PyTorch: 1.8
mmcv-full: 1.2.0

@praeclarumjj3
Collaborator

We have built the SeMask-FPN on top of the original MMSegmentation repo found here: https://github.com/open-mmlab/mmsegmentation. I am trying to understand if the issue is with an operation used in SeMask or with the original mmsegmentation library codebase. If the issue is with the original mmsegmentation library, opening an issue on their GitHub will help us quickly solve the problem.

So, could you try running the script for a Swin-based model in the original MMSegmentation codebase?

@romil611
Author

Oh, that's what you meant.
I had actually tried that. The Swin models introduced the roll function. Support for exporting that op was added in torch 1.10, but with that version another feature used by mmsegmentation caused issues, since it had been removed in torch 1.9. This is the error with torch 1.10:
AttributeError: 'super' object has no attribute '_specify_ddp_gpu_num'
Issue link with the torch.roll related discussion: pytorch/pytorch#46586
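For anyone hitting the same export failure: the workaround discussed in that PyTorch issue lowers `roll` to operators opset 11 already supports (Slice + Concat). A minimal NumPy sketch of the equivalent decomposition (illustrative only; `roll_via_slices` is a hypothetical name, not the actual ONNX symbolic registration):

```python
import numpy as np

def roll_via_slices(x, shift, axis):
    # roll along one axis expressed as two slices stitched back together,
    # i.e. the Slice + Concat pattern an ONNX-friendly symbolic would emit
    n = x.shape[axis]
    shift = shift % n
    if shift == 0:
        return x.copy()
    head = np.take(x, range(n - shift), axis=axis)     # elements shifted right
    tail = np.take(x, range(n - shift, n), axis=axis)  # wrap-around elements
    return np.concatenate([tail, head], axis=axis)

x = np.arange(12).reshape(3, 4)
assert np.array_equal(roll_via_slices(x, 2, 1), np.roll(x, 2, axis=1))
```

The same slice-and-concatenate rewrite is what the custom symbolic for `torch.roll` emits, which is why it exports cleanly to opset 11.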

@praeclarumjj3
Collaborator

Right, this seems to be a version-related issue. We use torch v1.8.0, which might not support the conversion. Did you try the conversion for the Swin-UPerNet model?

@praeclarumjj3
Collaborator

praeclarumjj3 commented Mar 29, 2022

Right, this seems to be a version-related issue. We use torch v1.8.0, which might not support the conversion. Did you try the conversion for the Swin-UPerNet model?

I remember encountering the _specify_ddp_gpu_num AttributeError when using a non-compatible version of mmsegmentation.

@romil611
Author

Did you try the conversion for the Swin-UPerNet model?
Yes, Swin-T found here: https://github.com/SwinTransformer/Swin-Transformer-Semantic-Segmentation

@praeclarumjj3
Collaborator

Did it run error-free?

@romil611
Author

No, I got similar issues there as well, although I didn't try adding the roll operator when trying that code.

@praeclarumjj3
Collaborator

Right, it's probably an MMSegmentation issue. I suggest you open an issue on the official repo: https://github.com/open-mmlab/mmsegmentation/tree/master/mmseg.

The library maintainers should be able to help find the exact problem.

@romil611
Author

romil611 commented Mar 29, 2022

One more issue that I found:
When running the demo file present here: https://github.com/Picsart-AI-Research/SeMask-Segmentation/blob/main/SeMask-FPN/demo/demo.py, the function show_result_pyplot ends up calling show_result, and the call does not match show_result's signature.

One of them needs slight modifications; I modified the latter for quick testing. Do you want me to open a new issue?

@praeclarumjj3
Collaborator

praeclarumjj3 commented Mar 29, 2022

Oh, is that so? What kind of modifications? You can mention them here, and I will fix the issues.

@romil611
Author

romil611 commented Mar 29, 2022

I just changed the definition to match how it was being called, plus one or two other changes. I also removed opacity. I wanted to test on my own images, and this felt like the fastest way. This function is probably called from other places as well, so my temporary fix would likely break other things. It would be best to modify the functions calling show_result instead.
My temporary fix, if you are interested:

def show_result(self,
                    img,
                    result,
                    palette=None,
                    win_name='',
                    show=False,
                    wait_time=0,
                    out_file="result"):
        """Draw `result` over `img`.

        Args:
            img (str or Tensor): The image to be displayed.
            result (Tensor): The semantic segmentation results to draw over
                `img`.
            palette (list[list[int]]] | np.ndarray | None): The palette of
                segmentation map. If None is given, random palette will be
                generated. Default: None
            win_name (str): The window name.
            wait_time (int): Value of waitKey param.
                Default: 0.
            show (bool): Whether to show the image.
                Default: False.
            out_file (str or None): The filename prefix for the output
                images. Default: "result".

        Returns:
            color_seg (ndarray): The blended overlay image.
        """
        
        assert len(self.CLASSES) in [19, 150, 171]
        
        img = mmcv.imread(img)
        img = img.copy()
        h, w = img.shape[:2]
        seg = result[0][0] #result[0]
        seg = mmcv.imresize(seg, (w, h), interpolation='nearest')
        if palette is None:
            if self.PALETTE is None:
                palette = np.random.randint(
                    0, 255, size=(len(self.CLASSES), 3))
            else:
                palette = self.PALETTE
        palette = np.array(palette)
        assert palette.shape[0] == len(self.CLASSES)
        assert palette.shape[1] == 3
        assert len(palette.shape) == 2
        color_seg = np.zeros((seg.shape[0], seg.shape[1], 3), dtype=np.uint8)
        for label, color in enumerate(palette):
            color_seg[seg == label, :] = color
        # convert to BGR
        color_seg = color_seg[..., ::-1]

        pred = color_seg.copy()
        color_seg = img * 0.5 + color_seg * 0.5
        img = img.astype(np.uint8)
        color_seg = color_seg.astype(np.uint8)
        pred = pred.astype(np.uint8)
        # if out_file specified, do not show image in window
        if out_file is not None:
            show = False

            save_file_pred = os.path.join(out_file + '_PRED.png')
            save_file_img = os.path.join(out_file + '_IMG.png')
            save_file_gt = os.path.join(out_file + '_GT.png')
            save_file_overlap = os.path.join(out_file + '_OVERLAP.png')
        '''
        if len(self.CLASSES) == 19:
            gt_file = out_file[1].replace('_leftImg8bit.png', '_gtFine_labelIds.png')
            gt_file = gt_file.replace('/leftImg8bit/', '/gtFine/')   
        elif len(self.CLASSES) == 150:
            #gt_file = out_file[1].replace('/images/', '/annotations/')
            #gt_file = gt_file.replace('.jpg', '.png')
            gt_file = out_file
        elif len(self.CLASSES) == 171:
            gt_file = out_file[1].replace('/images/', '/annotations/')
            gt_file = gt_file.replace('.jpg', '_labelTrainIds.png')
        
        gt = mmcv.imread(gt_file, flag='grayscale')
        gt = mmcv.imresize(gt, (w, h), interpolation='nearest')
        
        if len(self.CLASSES) == 19:
            gt = id2trainId(gt, id_to_trainid)
        elif len(self.CLASSES) == 150:
            gt = gt - 1
        elif len(self.CLASSES) == 171:
            gt = gt - 1
            
        color_gt = np.zeros((gt.shape[0], gt.shape[1], 3), dtype=np.uint8)
        for label, color in enumerate(palette):
            color_gt[gt == label, :] = color
        # convert to BGR
        color_gt = color_gt[..., ::-1]

        color_gt = color_gt.astype(np.uint8)
        '''
        if show:
            mmcv.imshow(color_seg, win_name, wait_time)
        if out_file is not None:
            mmcv.imwrite(color_seg, save_file_overlap)
            mmcv.imwrite(img, save_file_img)
            #mmcv.imwrite(color_gt, save_file_gt)
            mmcv.imwrite(pred, save_file_pred)

        if not (show or out_file):
            warnings.warn('show==False and out_file is not specified, only '
                          'result image will be returned')
        return color_seg  # was: img

@praeclarumjj3
Collaborator

I am sorry, but I cannot understand the reason for the changes.

@romil611
Author

romil611 commented Mar 30, 2022

I removed i from the definition because it was not being passed and I didn't need it, and I also provided a default value for out_file.
I changed seg = result[0] to seg = result[0][0] because the result format was different.
I removed opacity from the functions calling this one; if you want opacity, you can modify the line color_seg = img * 0.5 + color_seg * 0.5.
Also, I'm always returning color_seg instead of conditionally returning img.
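If it helps, the hard-coded 50/50 blend could be parameterized rather than removed; a small sketch (blend_overlay and its opacity argument are hypothetical names, not part of the repo's API):

```python
import numpy as np

def blend_overlay(img, color_seg, opacity=0.5):
    # weighted blend of the input image and the colored segmentation map;
    # opacity=0.5 reproduces the original img * 0.5 + color_seg * 0.5 line
    out = (img.astype(np.float32) * (1.0 - opacity)
           + color_seg.astype(np.float32) * opacity)
    return out.astype(np.uint8)

img = np.full((2, 2, 3), 100, dtype=np.uint8)
seg = np.full((2, 2, 3), 200, dtype=np.uint8)
assert (blend_overlay(img, seg, 0.5) == 150).all()
```

With opacity exposed as a keyword argument, the callers that previously passed opacity would not need to be touched.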

@praeclarumjj3
Collaborator

Hi @romil611, thanks for pointing the inconsistency out. I had already included a different method for inference in the code but missed changing the call in the GitHub repo.

img = model.show_inference_result(

I define the show_inference_result in the base.py file:

def show_inference_result(self,

@romil611
Author

Oh, there was a function ready to use just above it. I missed it, lol.
Anyway, glad I could help.

@praeclarumjj3
Collaborator

I am closing this issue since the ONNX problem stems from the mmsegmentation library, in my opinion. Feel free to reopen the issue if you have any more questions.
