
Can dpt models be traced? #42

Open
3togo opened this issue Jul 27, 2021 · 20 comments


@3togo

3togo commented Jul 27, 2021

I tried to trace "dpt_hybrid_midas" by calling

torch.jit.trace(model, example_input)

However, it failed with the error messages below.
Any pointers on how to do it properly?
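
For reference, the tracing attempt looked roughly like this (a minimal sketch; the weight path, backbone name, and input size are assumptions based on the DPT repository defaults, not the exact script used):

import torch
from dpt.models import DPTDepthModel

# Hypothetical weight path; the backbone string follows the repo's run scripts.
model = DPTDepthModel(
    path="weights/dpt_hybrid-midas-501f0c75.pt",
    backbone="vitb_rn50_384",
    non_negative=True,
)
model.eval()

# Dummy NCHW input at the nominal 384x384 resolution.
example_input = torch.randn(1, 3, 384, 384)
traced = torch.jit.trace(model, example_input)
traced.save("dpt_hybrid_traced.pt")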

/usr/local/lib/python3.9/dist-packages/torch/_tensor.py:575: UserWarning: floor_divide is deprecated, and will be removed in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values.
To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor'). (Triggered internally at /pytorch/aten/src/ATen/native/BinaryOps.cpp:467.)
return torch.floor_divide(self, other)
/mnt/data/git/DPT/dpt/vit.py:154: TracerWarning: Using len to get tensor shape might cause the trace to be incorrect. Recommended usage would be tensor.shape[0]. Passing a tensor of different shape might lead to errors or silently give incorrect results.
gs_old = int(math.sqrt(len(posemb_grid)))
/usr/local/lib/python3.9/dist-packages/torch/nn/functional.py:3609: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
warnings.warn(
Traceback (most recent call last):
File "/mnt/data/git/DPT/export_model.py", line 112, in
convert(in_model_path, out_model_path)
File "/mnt/data/git/DPT/export_model.py", line 64, in convert
sm = torch.jit.trace(model, example_input)
File "/usr/local/lib/python3.9/dist-packages/torch/jit/_trace.py", line 735, in trace
return trace_module(
File "/usr/local/lib/python3.9/dist-packages/torch/jit/_trace.py", line 952, in trace_module
module._c._create_method_from_trace(
File "/usr/local/lib/python3.9/dist-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/usr/local/lib/python3.9/dist-packages/torch/nn/modules/module.py", line 1039, in _slow_forward
result = self.forward(*input, **kwargs)
File "/mnt/data/git/DPT/dpt/models.py", line 115, in forward
inv_depth = super().forward(x).squeeze(dim=1)
File "/mnt/data/git/DPT/dpt/models.py", line 72, in forward
layer_1, layer_2, layer_3, layer_4 = forward_vit(self.pretrained, x)
File "/mnt/data/git/DPT/dpt/vit.py", line 120, in forward_vit
nn.Unflatten(
File "/usr/local/lib/python3.9/dist-packages/torch/nn/modules/flatten.py", line 102, in init
self._require_tuple_int(unflattened_size)
File "/usr/local/lib/python3.9/dist-packages/torch/nn/modules/flatten.py", line 125, in _require_tuple_int
raise TypeError("unflattened_size must be tuple of ints, " +
TypeError: unflattened_size must be tuple of ints, but found element of type Tensor at pos 0

@ranftlr
Contributor

ranftlr commented Jul 29, 2021

Unfortunately, the current model isn't traceable. As this is a rather popular request (see also isl-org/MiDaS#122), we are working on a rewrite to fix this.

@3togo
Author

3togo commented Jul 30, 2021

@ranftlr,

many thanks for your prompt reply.

eli

@ranftlr
Contributor

ranftlr commented Aug 5, 2021

I just pushed a preview of a scriptable and traceable model to branch "dpt_scriptable": https://github.com/isl-org/DPT/tree/dpt_scriptable. Note that you have to download updated weight files for this to work. You can find updated links in the README of the branch.

Please let us know if this solves your problem or if you experience any issues with this.
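
For anyone trying the branch, a rough usage sketch (constructor arguments mirror the main-branch run scripts and are assumptions here; as the follow-ups below show, success can still depend on the PyTorch version):

import torch
from dpt.models import DPTDepthModel

# Updated weight file linked in the branch README (path is a placeholder).
model = DPTDepthModel(
    path="weights/dpt_hybrid-midas-d889a10e.pt",
    backbone="vitb_rn50_384",
    non_negative=True,
)
model.eval()

scripted = torch.jit.script(model)                            # scripting path
traced = torch.jit.trace(model, torch.randn(1, 3, 384, 384))  # tracing path
scripted.save("dpt_hybrid_scripted.pt")
traced.save("dpt_hybrid_traced.pt")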

@phamdat09

@ranftlr
Thanks for your work.
This code does not work with torch.onnx export. Could you take a look? Thanks.

@3togo
Author

3togo commented Sep 13, 2021

@ranftlr ,
I tried to trace your "dpt_hybrid-midas-d889a10e.pt" using torch.jit.trace, but it failed.

Below is the error message:
File "/usr/local/lib/python3.9/dist-packages/torch/_tensor.py", line 867, in unflatten
return super(Tensor, self).unflatten(dim, sizes, names)
RuntimeError: NYI: Named tensors are not supported with the tracer

errors.txt

@AbdouSarr

Is there a fix for this yet, @ranftlr? Thank you.

@Wing100

Wing100 commented Oct 15, 2021

(quoting the original report and traceback from @3togo above)

Hello, I also encountered the same problem. Has it been solved?

@guillesanbri

Hi, I have been trying to export DPT-Hybrid to ONNX today using the dpt_scriptable branch and also encountered RuntimeError: NYI: Named tensors are not supported with the tracer. I found this PyTorch issue, which looks like the same problem. The issue is the usage of unflatten. I successfully exported the ONNX model after removing these two unflatten calls (vit.py, lines ~320)

layer_3 = self.act_postprocess3(layer_3.unflatten(2, out_size))
layer_4 = self.act_postprocess4(layer_4.unflatten(2, out_size))

and using view instead

x3, y3, z3 = layer_3.shape
layer_3 = self.act_postprocess3(layer_3.view(x3, y3, *out_size))
x4, y4, z4 = layer_4.shape
layer_4 = self.act_postprocess4(layer_4.view(x4, y4, *out_size))

The hybrid model doesn't need to convert layer1 and layer2, but the same solution probably applies.

I will test further and comment back soon.
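
For context, the export call after that edit could look roughly like this (a sketch; model construction is as in the earlier comments, the file names and opset are placeholders, and the input size is kept static as discussed below):

import torch

# `model` is the patched DPT-Hybrid from the dpt_scriptable branch, already in eval() mode.
dummy = torch.randn(1, 3, 384, 384)  # fixed input size; the view() patch bakes shapes into the graph

torch.onnx.export(
    model,
    dummy,
    "dpt_hybrid.onnx",
    input_names=["image"],
    output_names=["inverse_depth"],
    opset_version=12,  # adjust to your PyTorch version
)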

@Wing100

Wing100 commented Oct 18, 2021

@guillesanbri If the network uses a pre-trained model from elsewhere, such as a ResNet-50, can DPT models still be traced?

RuntimeError: Error(s) in loading state_dict for net:
Missing key(s) in state_dict:
Unexpected key(s) in state_dict:

@guillesanbri

@Wing100 I'm not sure what you are referring to; I traced the hybrid model, which has a ResNet-50 inside. The error you got seems to be related to loading model parameters from another model without setting strict=False, but as far as I know that is not related to tracing the model.
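
To illustrate the strict=False point with a self-contained toy example (names are hypothetical and unrelated to DPT):

import torch
import torch.nn as nn

# Hypothetical target network; in practice this would be the model being initialized.
net = nn.Sequential(nn.Conv2d(3, 16, 3), nn.ReLU(), nn.Conv2d(16, 1, 3))

# A checkpoint whose keys only partially match the target architecture.
checkpoint = {"0.weight": torch.randn(16, 3, 3, 3), "extra.weight": torch.randn(1)}

# strict=True (the default) raises the "Missing key(s) / Unexpected key(s)" error;
# strict=False loads whatever matches and reports the rest.
result = net.load_state_dict(checkpoint, strict=False)
print("missing:", result.missing_keys)        # keys the net expects but the checkpoint lacks
print("unexpected:", result.unexpected_keys)  # keys in the checkpoint the net does not use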

@romil611

@guillesanbri Hi, after changing unflatten to view, I get the following error:

RuntimeError: Unsupported: ONNX export of transpose for tensor of unknown rank.

Did you encounter this, and/or do you know how to solve it? Thanks in advance!

@guillesanbri

@romil611 I think I got that error when playing with the dynamic axes of the ONNX export. My use case doesn't need dynamic axes, so I have kept the sizes static for now. Will ping you if I get back to that.

@romil611

@guillesanbri I also need static sizes and didn't add the dynamic_axes option to the torch.onnx.export call. My guess is that dynamic axes are being used somewhere inside, which is causing the issue. If you remember anything related to it, do tell.
Anyway, thanks for the reply!

@ghost

ghost commented Nov 17, 2021

(quoting @romil611's question about RuntimeError: Unsupported: ONNX export of transpose for tensor of unknown rank)

@romil611 I saw this error when I was calling torch.onnx.export on the scripted version of the model. Make sure you don't have

model = torch.jit.script(model)

anywhere preceding your export call.

For me, the other secret for a successful export (in addition to the edits @guillesanbri has already suggested) was to keep everything on the CPU. According to this comment, the device the model was running on when exported does not affect the resulting ONNX model.
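
Putting those two points together, a sketch of an export path that avoids both pitfalls (`model` is assumed to be the patched DPT model; file names are placeholders):

import torch

model = model.cpu().eval()            # keep everything on the CPU for the export
dummy = torch.randn(1, 3, 384, 384)

# Export the plain nn.Module directly; do NOT call torch.jit.script(model) first.
torch.onnx.export(
    model,
    dummy,
    "dpt_hybrid_cpu.onnx",
    input_names=["image"],
    output_names=["inverse_depth"],
)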

@romil611

romil611 commented Nov 17, 2021

For me, torch.onnx.export worked on the main branch itself once I changed unflatten to view.

@3togo
Author

3togo commented Nov 20, 2021

Thank you all for your efforts. The problem is fixed by using the latest version of PyTorch.

@3togo 3togo closed this as completed Nov 20, 2021
@jucic

jucic commented Dec 7, 2021

(quoting @guillesanbri's unflatten-to-view workaround above)

I tried to export DPT-Hybrid to ONNX today using the dpt_scriptable branch, but encountered the following issue:
[screenshot of the error]
Do you know why? It seems to be a bug in the model returned by timm.create_model("vit_base_resnet50_384", pretrained=pretrained).
I tried changing x = self.model.patch_embed.backbone(x) to x = self.model.patch_embed.backbone(x.contiguous()),
but it doesn't work. Do you know what the problem is? Thanks in advance!

I solved the above problem by downgrading timm, but I encountered another problem: Exporting the operator std_mean to ONNX opset version 12 is not supported. Please open a bug to request ONNX export support for the missing operator. Does anyone know how to solve it?
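
One untested guess (not confirmed anywhere in this thread): torch.onnx.export accepts an opset_version argument, and a newer opset together with a newer PyTorch release may or may not cover std_mean:

import torch

# Same export call as before, but requesting a newer opset; whether std_mean becomes
# exportable depends on the PyTorch/ONNX versions, so treat this purely as a guess.
torch.onnx.export(model, dummy, "dpt_hybrid.onnx", opset_version=13)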

@Tord-Zhang

Tord-Zhang commented Mar 1, 2022

@guillesanbri @ranftlr It seems that the converted ONNX model can only support inputs with a static size? The patch size cannot be changed once the model is converted to ONNX.

@3togo
Author

3togo commented Mar 22, 2023

I got the following errors when I tried to trace "dpt_beit_large_384.pt".

Any help?

Traceback (most recent call last):
  File "/work/gitee/MiDaS-cpp/python/export_model.py", line 162, in <module>
    convert(in_model_type, in_model_path, out_model_path)
  File "/work/gitee/MiDaS-cpp/python/export_model.py", line 84, in convert
    sm = torch.jit.trace(model, sample, strict=False)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/eli/.local/lib/python3.11/site-packages/torch/jit/_trace.py", line 794, in trace
    return trace_module(
           ^^^^^^^^^^^^^
  File "/home/eli/.local/lib/python3.11/site-packages/torch/jit/_trace.py", line 1084, in trace_module
    _check_trace(
  File "/home/eli/.local/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/eli/.local/lib/python3.11/site-packages/torch/jit/_trace.py", line 562, in _check_trace
    raise TracingCheckError(*diag_info)
torch.jit._trace.TracingCheckError: Tracing failed sanity checks!
ERROR: Graphs differed across invocations!
	Graph diff:
		  graph(%self.1 : __torch__.midas.dpt_depth.DPTDepthModel,
		        %x.1 : Tensor):
		    %scratch : __torch__.torch.nn.modules.module.Module = prim::GetAttr[name="scratch"](%self.1)
		    %output_conv : __torch__.torch.nn.modules.container.Sequential = prim::GetAttr[name="output_conv"](%scratch)
		    %scratch.15 : __torch__.torch.nn.modules.module.Module = prim::GetAttr[name="scratch"](%self.1)
		    %refinenet1 : __torch__.midas.blocks.FeatureFusionBlock_custom = prim::GetAttr[name="refinenet1"](%scratch.15)
		    %scratch.13 : __torch__.torch.nn.modules.module.Module = prim::GetAttr[name="scratch"](%self.1)
		    %refinenet2 : __torch__.midas.blocks.FeatureFusionBlock_custom = prim::GetAttr[name="refinenet2"](%scratch.13)
		    %scratch.11 : __torch__.torch.nn.modules.module.Module = prim::GetAttr[name="scratch"](%self.1)
		    %refinenet3 : __torch__.midas.blocks.FeatureFusionBlock_custom = prim::GetAttr[name="refinenet3"](%scratch.11)
		    %scratch.9 : __torch__.torch.nn.modules.module.Module = prim::GetAttr[name="scratch"](%self.1)
		    %refinenet4 : __torch__.midas.blocks.FeatureFusionBlock_custom = prim::GetAttr[name="refinenet4"](%scratch.9)
		    %scratch.7 : __torch__.torch.nn.modules.module.Module = prim::GetAttr[name="scratch"](%self.1)
		    %layer4_rn : __torch__.torch.nn.modules.conv.Conv2d = prim::GetAttr[name="layer4_rn"](%scratch.7)
		    %scratch.5 : __torch__.torch.nn.modules.module.Module = prim::GetAttr[name="scratch"](%self.1)
		    %layer3_rn : __torch__.torch.nn.modules.conv.Conv2d = prim::GetAttr[name="layer3_rn"](%scratch.5)
		    %scratch.3 : __torch__.torch.nn.modules.module.Module = prim::GetAttr[name="scratch"](%self.1)
		    %layer2_rn : __torch__.torch.nn.modules.conv.Conv2d = prim::GetAttr[name="layer2_rn"](%scratch.3)
		    %scratch.1 : __torch__.torch.nn.modules.module.Module = prim::GetAttr[name="scratch"](%self.1)
		    %layer1_rn : __torch__.torch.nn.modules.conv.Conv2d = prim::GetAttr[name="layer1_rn"](%scratch.1)
		    %pretrained : __torch__.torch.nn.modules.module.Module = prim::GetAttr[name="pretrained"](%self.1)
		    %act_postprocess4 : __torch__.torch.nn.modules.container.Sequential = prim::GetAttr[name="act_postprocess4"](%pretrained)
		    %_4.7 : __torch__.torch.nn.modules.conv.Conv2d = prim::GetAttr[name="4"](%act_postprocess4)
		    %pretrained.83 : __torch__.torch.nn.modules.module.Module = prim::GetAttr[name="pretrained"](%self.1)
		    %act_postprocess4.5 : __torch__.torch.nn.modules.container.Sequential = prim::GetAttr[name="act_postprocess4"](%pretrained.83)
		    %_3.9 : __torch__.torch.nn.modules.conv.Conv2d = prim::GetAttr[name="3"](%act_postprocess4.5)
		    %pretrained.81 : __torch__.torch.nn.modules.module.Module = prim::GetAttr[name="pretrained"](%self.1)
		    %act_postprocess3 : __torch__.torch.nn.modules.container.Sequential = prim::GetAttr[name="act_postprocess3"](%pretrained.81)

@3togo 3togo reopened this Mar 22, 2023
@foemre

foemre commented Apr 10, 2023

isl-org/MiDaS#189
I can verify that dpt_large_384.pt in MiDaS v3.1 can be traced using torch.jit.trace, but I cannot export the model to ONNX. I'm receiving RuntimeError: Input type (float) and bias type (c10::Half) should be the same. Has anyone had any experience exporting the latest models to ONNX?
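
The mismatch suggests the checkpoint stores half-precision weights; one hedged workaround (not verified on this model) is to cast everything to float32 before exporting:

import torch

# Cast fp16 parameters/buffers to fp32 so the float input matches the bias dtype.
model = model.float().cpu().eval()
dummy = torch.randn(1, 3, 384, 384)

traced = torch.jit.trace(model, dummy)                 # tracing reportedly works already
torch.onnx.export(model, dummy, "dpt_large_384.onnx")  # export attempt after the cast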
