How to convert DINOv2 to ONNX? #216

Open

PeterKim1 opened this issue Sep 15, 2023 · 13 comments
@PeterKim1

Hi. Thanks for your great work.

I want to convert DINOv2 to ONNX, but I have failed so far.

I tried to follow issue #19.

I applied the fix from #19 (comment), but after that the error from #19 (comment) occurred.

So I tried to apply #19 (comment) as well, but the error still occurs.

Are there any guidelines for converting to ONNX?

I need to get this model working quickly for semantic segmentation tasks.

Thanks.

@seddonm1

Changing this bit in vision_transformer.py on line 187 will allow export:

patch_pos_embed = nn.functional.interpolate(
    patch_pos_embed.reshape(1, int(math.sqrt(N)), int(math.sqrt(N)), dim).permute(0, 3, 1, 2),
    scale_factor=(float(w0 / math.sqrt(N)), float(h0 / math.sqrt(N))),
    mode="bicubic",
)
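For reference, here is a minimal export sketch assuming the change above has been applied; the hub entry point, the 224x224 input, the opset version and the file name are illustrative assumptions, not an official recipe.

import torch

# Minimal sketch (not an official recipe): export DINOv2 ViT-S/14 to ONNX
# after patching interpolate_pos_encoding as shown above.
model = torch.hub.load("facebookresearch/dinov2", "dinov2_vits14")
model.eval()

# Height and width must be multiples of the 14-pixel patch size.
dummy = torch.randn(1, 3, 224, 224)

torch.onnx.export(
    model,
    dummy,
    "dinov2_vits14.onnx",          # assumed output path
    input_names=["input"],
    output_names=["features"],
    opset_version=17,
)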

@dacquaviva

dacquaviva commented Sep 25, 2023

Thanks @seddonm1 for the workaround; it works for me with batch size 1. However, I am trying to get a dynamic batch size: I am able to convert the model to ONNX with a dynamic batch size, but when I load it I get an error. Has anyone managed this?
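For what it's worth, a dynamic batch dimension is usually declared via dynamic_axes at export time; a hedged sketch follows (tensor names, file name and opset are assumptions, and this does not by itself make the spatial dimensions dynamic).

import torch

# Sketch: export with a dynamic batch axis only (height/width stay fixed).
model = torch.hub.load("facebookresearch/dinov2", "dinov2_vits14").eval()
dummy = torch.randn(2, 3, 224, 224)

torch.onnx.export(
    model,
    dummy,
    "dinov2_vits14_dyn_batch.onnx",  # assumed output path
    input_names=["input"],
    output_names=["features"],
    dynamic_axes={"input": {0: "batch"}, "features": {0: "batch"}},
    opset_version=17,
)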

@barbolo

barbolo commented Apr 1, 2024

I've exported class token + patch tokens: #167 (comment)
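For readers who don't want to follow the link, one way to get both outputs is to wrap the backbone before export. This is only a sketch of the idea, not the exact code from #167; the forward_features() dict keys ("x_norm_clstoken", "x_norm_patchtokens") are assumed from the DINOv2 repo, and everything else is illustrative.

import torch
import torch.nn as nn

class DinoV2Tokens(nn.Module):
    """Sketch: concatenate the class token with the patch tokens for export."""

    def __init__(self, backbone: nn.Module):
        super().__init__()
        self.backbone = backbone

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = self.backbone.forward_features(x)
        cls_token = feats["x_norm_clstoken"].unsqueeze(1)   # [B, 1, C]
        patch_tokens = feats["x_norm_patchtokens"]          # [B, N, C]
        return torch.cat([cls_token, patch_tokens], dim=1)  # [B, 1 + N, C]

backbone = torch.hub.load("facebookresearch/dinov2", "dinov2_vits14").eval()
torch.onnx.export(
    DinoV2Tokens(backbone),
    torch.randn(1, 3, 224, 224),
    "dinov2_cls_patch.onnx",  # assumed output path
    input_names=["input"],
    output_names=["tokens"],
    opset_version=17,
)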

@barbolo

barbolo commented Apr 6, 2024

I've tried to export to ONNX using dynamic input and output shapes. The model is exported and seems fine, however the ONNX model throws an exception during inference when the input is not the same as the input sample fed during the export.

For example, when I export a model with an input of shape [1, 3, 168, 168] (batch_size x C x H x W), the last hidden state (class token + patch tokens) has 145 features. When I try to use that model with an input of shape [1, 3, 112, 112] (which should output 65 features), the following exception is thrown:

[E:onnxruntime:, sequential_executor.cc:514 ExecuteKernel] Non-zero status code returned while running Add node. Name:'/embeddings/Add' Status Message: /Users/runner/work/1/s/onnxruntime/core/providers/cpu/math/element_wise_ops.h:560 void onnxruntime::BroadcastIterator::Append(ptrdiff_t, ptrdiff_t) axis == 1 || axis == largest was false. Attempting to broadcast an axis by a dimension other than 1. 65 by 145

@WulongGuo

WulongGuo commented Apr 24, 2024

(quoting @barbolo's comment above about the dynamic-shape export failing at the '/embeddings/Add' node with "Attempting to broadcast an axis by a dimension other than 1. 65 by 145")

@barbolo hello, I got the same error. Have you figured out how to solve this problem?

@barbolo

barbolo commented Apr 24, 2024

@WulongGuo no, I haven't, and I'm not sure there is a solution. I've seen other ViT-like repositories with downloadable ONNX/OpenVINO models, and all of them have fixed input shapes.

For my use case I'm interested in reducing inference time, so I've exported one model per input shape I'm using and I keep them all loaded in memory. This approach uses more memory, but the inference time is optimized.
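A hedged sketch of that per-shape approach (the file names, the "input" tensor name and the two resolutions are assumptions):

import numpy as np
import onnxruntime as ort

# Sketch: one exported ONNX file per input resolution, one cached
# onnxruntime session per shape, picked at inference time.
SESSIONS = {
    (168, 168): ort.InferenceSession("dinov2_168x168.onnx"),  # assumed paths
    (112, 112): ort.InferenceSession("dinov2_112x112.onnx"),
}

def run(image: np.ndarray) -> np.ndarray:
    """image: float32 array of shape [1, 3, H, W], already normalized."""
    h, w = image.shape[2], image.shape[3]
    session = SESSIONS[(h, w)]
    return session.run(None, {"input": image})[0]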

@WulongGuo

@barbolo OK, thanks for your reply. I'll just use the fixed-input version.

@Zalways

Zalways commented Apr 29, 2024

(quoting @barbolo's comment above about the dynamic-shape export failing at the '/embeddings/Add' node)

I also ran into this problem. Before export the model can run inference on different input shapes in Python, but the exported ONNX model can only run inference on inputs with the same width and height as the export sample; it fails on other shapes.

@Zalways

Zalways commented Apr 29, 2024

I'd appreciate it if anyone could solve this problem.

@100rab-S

@Zalways @barbolo

(quoting @Zalways's comment above about the exported model only accepting the export-time input shape)

I'm facing the same issue. It is something with the nodes in the model: although we mark the image height and width as dynamic during export, they somehow remain static in the ONNX graph, and inference then throws the error mentioned above. The export warnings below reflect this as well. It seems we can only use static shapes, even though the model can be exported with dynamic axes.

The downside is that we now have to downscale or upscale the images to a static shape 😢 (a minimal resize sketch follows the warnings below).
@patricklabatut I also tried exporting with Torch's newer dynamo_export, but that failed too. Has anyone successfully exported a fully dynamic ONNX model?

/Users/sourabh/Library/Caches/pypoetry/virtualenvs/devit-KwOi2sgN-py3.8/lib/python3.8/site-packages/torch/onnx/utils.py:1548: OnnxExporterWarning: Exporting to ONNX opset version 19 is not supported. by 'torch.onnx.export()'. The highest opset version supported is 17. To use a newer opset version, consider 'torch.onnx.dynamo_export()'. Note that dynamo_export() is in preview. Please report errors with dynamo_export() as Github issues to https://github.com/pytorch/pytorch/issues.
warnings.warn(
/Users/sourabh/.cache/torch/hub/facebookresearch_dinov2_main/dinov2/layers/patch_embed.py:72: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
assert H % patch_H == 0, f"Input image height {H} is not a multiple of patch height {patch_H}"
/Users/sourabh/.cache/torch/hub/facebookresearch_dinov2_main/dinov2/layers/patch_embed.py:73: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
assert W % patch_W == 0, f"Input image width {W} is not a multiple of patch width: {patch_W}"
/Users/sourabh/.cache/torch/hub/facebookresearch_dinov2_main/dinov2/models/vision_transformer.py:183: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if npatch == N and w == h:
/Users/sourabh/.cache/torch/hub/facebookresearch_dinov2_main/dinov2/models/vision_transformer.py:195: TracerWarning: Converting a tensor to a Python float might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
sqrt_N = math.sqrt(N)
/Users/sourabh/.cache/torch/hub/facebookresearch_dinov2_main/dinov2/models/vision_transformer.py:196: TracerWarning: Converting a tensor to a Python float might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
sx, sy = float(w0) / sqrt_N, float(h0) / sqrt_N
/Users/sourabh/.cache/torch/hub/facebookresearch_dinov2_main/dinov2/models/vision_transformer.py:204: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
assert int(w0) == patch_pos_embed.shape[-2]
/Users/sourabh/.cache/torch/hub/facebookresearch_dinov2_main/dinov2/models/vision_transformer.py:204: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
assert int(w0) == patch_pos_embed.shape[-2]
/Users/sourabh/.cache/torch/hub/facebookresearch_dinov2_main/dinov2/models/vision_transformer.py:205: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
assert int(h0) == patch_pos_embed.shape[-1]
/Users/sourabh/.cache/torch/hub/facebookresearch_dinov2_main/dinov2/models/vision_transformer.py:205: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
assert int(h0) == patch_pos_embed.shape[-1]
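A minimal sketch of the static-shape fallback mentioned above: resize every input to the fixed export resolution before running the ONNX model. The 224x224 target, the file name and the "input" tensor name are assumptions.

import numpy as np
import onnxruntime as ort
import torch
import torch.nn.functional as F

# Sketch: force every input to the fixed shape the ONNX model was exported with.
session = ort.InferenceSession("dinov2_vits14_224.onnx")  # assumed path

def infer(image: torch.Tensor) -> np.ndarray:
    """image: float32 tensor of shape [1, 3, H, W], already normalized."""
    resized = F.interpolate(image, size=(224, 224), mode="bilinear", align_corners=False)
    return session.run(None, {"input": resized.numpy()})[0]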

@Zalways

Zalways commented Jun 5, 2024

(quoting @100rab-S's comment and export warnings above in full)

Have you solved this problem?

@100rab-S

100rab-S commented Jun 9, 2024

@Zalways Nope.

@huangcj-code

@100rab-S Hello, I've encountered a similar issue myself. I'm curious to know: would there be any adverse effects if I resize the images to match the static input size?
