Grounding DINO fine-tuned with mmdetection cannot be TensorRT-accelerated #11342
Comments
This needs mmdeploy support. Their current rough plan is January, but it's not certain they can finish within January. It would be great if someone from the community is willing to take it on; you can open a PR against mmdet. |
I have a working example of converting GroundingDINO to TensorRT, with 40 ms inference on an RTX 3090. However, the ONNX exported from mmdetection's GroundingDINO currently cannot be converted to an engine, while the official GroundingDINO model can. Could we join as co-developers on mmdeploy? Our team is working on TensorRT acceleration for GroundingDINO. |
If you are interested in supporting this, that would be great. You lead the effort, and coordinate with the mmdeploy folks when you run into problems. |
Could I borrow your Python script for converting the GroundingDINO ONNX to TRT? I keep hitting errors on my side. |
I converted it with trtexec, not with a script.
|
When converting PyTorch to ONNX, did you export the entire Grounding DINO? |
Yes.
|
My converted model still has some issues; I'm not sure whether the ONNX export went wrong. I tested against https://github.com/wenyi5608/GroundingDINO.git. For ONNX to TRT I used:
trtexec --onnx=./weights/groundingdino_swint_ogc.onnx --saveEngine=./weights/groundingdino_swint_ogc.trt --best --workspace=1024
Could you share your ONNX export script? |
What is your environment? I'm using TensorRT 8.6.1.6 and CUDA 11.7.
|
I'm also on 8.6.1.6, with CUDA 12.3. What command did you use for the TRT conversion? |
|
I did get the engine built, but when writing the Python inference script, deserializing the engine hit a bug. I'm still working out how to run inference on the engine file. |
|
The engine I converted directly doesn't match the ONNX outputs. Does yours? |
Do you have a Python-side inference script? Are you using dynamic or static inputs? My inference code isn't finished yet, so I can't tell. |
@QzYER |
> export_trt.zip
> This is my inference script; it includes both ONNX and TRT paths. I use static inputs (couldn't get dynamic working), and currently the fp32 TRT output doesn't match ONNX.

Let me take a look first.
|
Did you test whether fp16 aligns? I just tested on my side and can't align precision either. When you test with different input texts, do the inputs such as input_ids, attention_mask and position_ids correspond correctly? |
|
They all correspond. Look at my script: the ONNX and TRT inputs are identical, and neither fp32 nor fp16 aligns. |
I looked at your inference code; mine is the same, and the outputs don't align either.
Here is my output.
|
If PyTorch or ONNX can't be aligned with TensorRT, the only way is to compare the output of a given PyTorch layer against the output tensor of the corresponding ONNX node, to first localize which part loses precision, and then find a replacement for the offending operator. I've already reported this to the TensorRT team; let's see what they say.
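The layer-by-layer localization described here boils down to comparing the same intermediate tensor from two backends. A minimal sketch of such a comparison helper (the function name and tolerances are my own, not from any script in this thread):

```python
import numpy as np

def report_mismatch(name, ref_out, test_out, atol=1e-3, rtol=1e-3):
    """Compare one layer/node's output from two backends (e.g. PyTorch
    hook vs. ONNX Runtime node output) and report the worst absolute
    and relative error, to localize where precision diverges."""
    a = np.asarray(ref_out, dtype=np.float64)
    b = np.asarray(test_out, dtype=np.float64)
    abs_err = np.abs(a - b)
    rel_err = abs_err / (np.abs(a) + 1e-12)
    ok = bool(np.all(abs_err <= atol + rtol * np.abs(a)))
    print(f"{name}: max_abs={abs_err.max():.3e} "
          f"max_rel={rel_err.max():.3e} aligned={ok}")
    return ok
```

Running this over each dumped layer, outermost first, narrows the mismatch down to a single operator before anything is replaced.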
|
Exporting with mmdeploy requires a fixed image size; the text can stay dynamic. With that, FP32 Grounding DINO precision aligns. |
This is my conversion process. How did you manage to fix the input image size, keep the text dynamic, and still get groundingdino exported?

```
(yolo8) bowen@bowen-MS-7D20:/media/bowen/6202c499-4f0a-4280-af7e-d2ab4b6c74dd/home/bowen/mmdeploy$ python /media/bowen/6202c499-4f0a-4280-af7e-d2ab4b6c74dd/home/bowen/mmdeploy/tools/deploy.py /media/bowen/6202c499-4f0a-4280-af7e-d2ab4b6c74dd/home/bowen/mmdeploy/configs/mmdet/detection/detection_tensorrt-fp16_dynamic-320x320-1344x1344.py /media/bowen/6202c499-4f0a-4280-af7e-d2ab4b6c74dd/home/bowen/mmdeploy/mmdetection/configs/grounding_dino/grounding_dino_swin-b_finetune_16xb2_1x_coco.py /media/bowen/6202c499-4f0a-4280-af7e-d2ab4b6c74dd/home/bowen/grounding_dino_deploy/weights.pth /media/bowen/6202c499-4f0a-4280-af7e-d2ab4b6c74dd/home/bowen/mmdeploy/mmdetection/demo/demo.jpg --work-dir mmdeploy_model/groundingdino --device cuda --dump-info

size mismatch for backbone.patch_embed.projection.weight: copying a param with shape torch.Size([96, 3, 4, 4]) from checkpoint, the shape in current model is torch.Size([128, 3, 4, 4]).
01/09 03:06:13 - mmengine - WARNING - DeprecationWarning: get_onnx_config will be deprecated in the future. Process Process-2:
``` |
I used mmdeploy; static vs. dynamic is just configured in the deploy config file, something like this:
codebase_config = dict( backend_config = dict( |
But converting GD requires rewriting some functions in the mmdeploy library |
Both BaseBackendModel and torch2onnx need slight modifications |
The tokenizer cannot be converted to ONNX; it has to be split out |
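With the tokenizer split out, the remaining text-side tensors can be rebuilt from its ids on the CPU before binding them to the engine. A minimal sketch (the helper name is my own; it assumes a single caption with full self-attention, whereas the real model may use a block-diagonal per-phrase mask):

```python
import numpy as np

def build_text_inputs(input_ids):
    """Given tokenizer output ids for one caption, build the remaining
    text-side tensors the exported graph expects (names follow the
    export inputs discussed in this thread)."""
    ids = np.asarray(input_ids, dtype=np.int64)[None, :]   # (1, seq_len)
    seq_len = ids.shape[1]
    attention_mask = np.ones((1, seq_len), dtype=np.int64)
    position_ids = np.arange(seq_len, dtype=np.int64)[None, :]
    token_type_ids = np.zeros((1, seq_len), dtype=np.int64)
    # Assumption: full attention over the single caption; the real model
    # may restrict attention to tokens within the same phrase.
    text_token_mask = np.ones((1, seq_len, seq_len), dtype=bool)
    return {
        "input_ids": ids,
        "attention_mask": attention_mask,
        "position_ids": position_ids,
        "token_type_ids": token_type_ids,
        "text_token_mask": text_token_mask,
    }
```

The dict's arrays can then be copied straight into the engine's input bindings.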
@wxz1996 How is the accuracy of your 40 ms TensorRT conversion? My TensorRT SwinB fp32 takes about 110 ms on an A100 and the accuracy matches, but neither fp16 nor int8 does |
> The BERT model can be included, but the tokenizer cannot; for convenience I just pulled the BERT model out as well.

You should write a blog post about this; I'd gladly pay for it.
|
+1 |
On my side, with the tokenizer split out, G-DINO's dynamic-shape fp32 precision aligns. swint inference on an A100 takes around 170 ms, about 30% faster than torch inference, but fp16 precision still doesn't align; I'm debugging it with polygraphy and happy to discuss once there's progress.

Environment: TensorRT 8.6.1.6, CUDA 11.7

torch to ONNX
reference: https://github.com/wenyi5608/GroundingDINO/blob/main/demo/export_openvino.py
Dynamic inputs: every dynamic dimension of each input and output must be declared, with indices starting from 0:

```python
dynamic_axes={
    "input_ids": {0: "batch_size", 1: "seq_len"},
    "attention_mask": {0: "batch_size", 1: "seq_len"},
    "position_ids": {0: "batch_size", 1: "seq_len"},
    "token_type_ids": {0: "batch_size", 1: "seq_len"},
    "text_token_mask": {0: "batch_size", 1: "seq_len", 2: "seq_len"},
    "img": {0: "batch_size", 2: "height", 3: "width"},
    "logits": {0: "batch_size"},
    "boxes": {0: "batch_size"}
}
```

opset_version: 16

ONNX to TensorRT:

```shell
./trtexec --onnx=/root/GroundingDINO/grounded.onnx --saveEngine=grounded.trt --minShapes=img:1x3x800x1200,input_ids:1x1,attention_mask:1x1,position_ids:1x1,token_type_ids:1x1,text_token_mask:1x1x1 --optShapes=img:1x3x800x1200,input_ids:1x6,attention_mask:1x6,position_ids:1x6,token_type_ids:1x6,text_token_mask:1x6x6 --maxShapes=img:1x3x800x1200,input_ids:1x25,attention_mask:1x25,position_ids:1x25,token_type_ids:1x25,text_token_mask:1x25x25
```

TensorRT inference: inference_trt.zip
|
groundingdino has a text branch, but if the tokenizer can't be converted to ONNX, how do you run TensorRT inference afterwards? Or do you run only the transformer part with TensorRT, split the text part out, and write separate logic whose output feeds the transformer?
|
Yes, split the text part out and write separate logic whose output feeds into the transformer |
I'm using your command line to convert to TRT on a 3090 right now; it's still converting. I'll tell you the result shortly. Thanks.
|
The ONNX I used is the one exported by the script you shared, and the onnx2tensorrt command is the one you gave, but inference with the resulting TRT fails. The error log is:
```
python trt_inference_on_a_image.py
[01/11/2024-15:58:50] [TRT] [W] TensorRT was linked against cuDNN 8.9.0 but loaded cuDNN 8.4.1
[01/11/2024-15:58:50] [TRT] [W] TensorRT was linked against cuDNN 8.9.0 but loaded cuDNN 8.4.1
[01/11/2024-15:58:51] [TRT] [W] CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage and speed up TensorRT initialization. See "Lazy Loading" section of CUDA documentation https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#lazy-loading
trt_inference_on_a_image.py:258: DeprecationWarning: Use set_input_shape instead.
  context.set_binding_shape(i, opt)
trt_inference_on_a_image.py:199: DeprecationWarning: Use get_tensor_shape instead.
  size = abs(trt.volume(context.get_binding_shape(i))) * bs
trt_inference_on_a_image.py:200: DeprecationWarning: Use get_tensor_dtype instead.
  dtype = trt.nptype(engine.get_binding_dtype(binding))
trt_inference_on_a_image.py:209: DeprecationWarning: Use get_tensor_mode instead.
  if engine.binding_is_input(binding):
trt_inference_on_a_image.py:220: DeprecationWarning: Use execute_async_v2 instead.
  context.execute_async(batch_size=1, bindings=bindings, stream_handle=stream.handle)
[01/11/2024-15:58:51] [TRT] [W] The enqueue() method has been deprecated when used with engines built from a network created with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag. Please use enqueueV2() instead.
[01/11/2024-15:58:51] [TRT] [W] Also, the batchSize argument passed into this function has no effect on changing the input shapes. Please use setBindingDimensions() function to change input shapes instead.
trt_inference_on_a_image.py:78: RuntimeWarning: overflow encountered in exp
  return 1/(1 + np.exp(-x))
Traceback (most recent call last):
  File "trt_inference_on_a_image.py", line 274, in <module>
    boxes_filt, pred_phrases = outputs_postprocess(tokenizer, output_data, box_threshold, text_threshold, with_logits=True, token_spans=None)
  File "trt_inference_on_a_image.py", line 143, in outputs_postprocess
    pred_phrase = get_phrases_from_posmap(logit > text_threshold, tokenized, tokenlizer)
  File "/home/liufurui/TensorRT-8.6.1.6/targets/x86_64-linux-gnu/bin/GroundingDINO/groundingdino/util/utils.py", line 607, in get_phrases_from_posmap
    token_ids = [tokenized["input_ids"][i] for i in non_zero_idx]
  File "/home/liufurui/TensorRT-8.6.1.6/targets/x86_64-linux-gnu/bin/GroundingDINO/groundingdino/util/utils.py", line 607, in <listcomp>
    token_ids = [tokenized["input_ids"][i] for i in non_zero_idx]
IndexError: list index out of range
```
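As an aside, the `RuntimeWarning: overflow encountered in exp` in this log comes from evaluating `1/(1 + np.exp(-x))` on large-magnitude logits; a numerically stable sigmoid avoids it. A sketch:

```python
import numpy as np

def stable_sigmoid(x):
    """Sigmoid that never exponentiates a positive number, so np.exp
    cannot overflow regardless of the logit magnitude."""
    x = np.asarray(x, dtype=np.float64)
    out = np.empty_like(x)
    pos = x >= 0
    out[pos] = 1.0 / (1.0 + np.exp(-x[pos]))   # exp of a non-positive value
    ex = np.exp(x[~pos])                        # exp of a negative value
    out[~pos] = ex / (1.0 + ex)
    return out
```

This only silences the warning; the IndexError itself is a separate problem with the token positions in the posmap.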
|
It's probably that your input_ids length isn't aligned; let me see your ONNX export code. You didn't change the inference code, right? |
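For reference, if the engine was built for a fixed (or bounded) text length, the runtime input_ids must be padded to that length before binding; a hypothetical helper (pad id 0 assumed, as for BERT's [PAD] token):

```python
import numpy as np

def pad_text_inputs(input_ids, target_len, pad_id=0):
    """Pad tokenizer ids to the length the engine was exported with,
    returning the padded ids plus the matching attention mask.
    Refuses to truncate, since that would silently drop caption tokens."""
    ids = list(input_ids)
    if len(ids) > target_len:
        raise ValueError(f"caption tokenizes to {len(ids)} > {target_len}")
    mask = [1] * len(ids) + [0] * (target_len - len(ids))
    ids = ids + [pad_id] * (target_len - len(ids))
    return (np.asarray([ids], dtype=np.int64),
            np.asarray([mask], dtype=np.int64))
```

Keeping the post-processing's `tokenized["input_ids"]` consistent with what was actually bound is exactly the kind of mismatch that produces the IndexError above.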
My input_ids at ONNX export time is the text marked in red. The input is dynamic-sized, between min and max. In the inference code I changed the image and text to my own. |
The input_ids weren't changed when exporting to ONNX. At inference time it's also "the runing dog ."
|
Package up your code and send it over; I'll run it for you when I have time |
Here is my export script, export_openvino.py. In the script I changed the model-loading line to model.load_state_dict(clean_state_dict(checkpoint)). The PyTorch model is in the attachment; it's a pth fine-tuned on my own dataset with mmdetection, with caption="pressure gauge ." (my dataset's text), and the config script is just GroundingDINO_SwinT_OGC.py.
Oversized attachment sent from QQ Mail:
weights.pth (1.98 GB, expires 2024-02-10 17:47). Download page: https://mail.qq.com/cgi-bin/ftnExs_download?k=213961378d4f4db31b37e6251f62001d194d50030a0651064c5b04005f4f570402584c005b06041f050b54050c5b5703040b5752396932450450065f4d111c421551610a&t=exs_ftn_download&code=a9a79b22
|
The GroundingDINO model takes text and image as simultaneous inputs, with cross-attention computed between them. If the tokenizer is split out and not accelerated, with only the transformer part run in TensorRT and the tokenizer folded into post-processing afterwards, I feel that would affect the results. |
|
How did you convert ONNX to TensorRT? Did you use a tool? On my side, I am able to fine-tune on my own dataset.
…------------------ 原始邮件 ------------------
发件人: "open-mmlab/mmdetection" ***@***.***>;
发送时间: 2024年2月23日(星期五) 上午10:04
***@***.***>;
***@***.***>;"State ***@***.***>;
主题: Re: [open-mmlab/mmdetection] 关于mmdetection微调之后 Grounding DINO 无法tensort加速问题 (Issue #11342)
这是我的导出脚本export_openvino.py,脚本中加载模型那里改为了model.load_state_dict(clean_state_dict(checkpoint))。pytorch模型在附件中,模型是mmdetection微调自己数据集之后的pth,caption="pressure gauge ." (自己数据集的text),config脚本就是GroundingDINO_SwinT_OGC.py。
(Quoting the earlier exchange:)

> When exporting to onnx, input_ids was left unchanged; at inference time the caption was also "the runing dog .".

> The onnx I used is the one produced by the script you shared, and I built the engine with the trtexec command you gave, but inference with the resulting trt fails. The log:
>
> python trt_inference_on_a_image.py
> [01/11/2024-15:58:50] [TRT] [W] TensorRT was linked against cuDNN 8.9.0 but loaded cuDNN 8.4.1
> [01/11/2024-15:58:51] [TRT] [W] CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage and speed up TensorRT initialization. See "Lazy Loading" section of CUDA documentation https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#lazy-loading
> trt_inference_on_a_image.py:258: DeprecationWarning: Use set_input_shape instead. context.set_binding_shape(i, opt)
> trt_inference_on_a_image.py:199: DeprecationWarning: Use get_tensor_shape instead. size = abs(trt.volume(context.get_binding_shape(i))) bs
> trt_inference_on_a_image.py:200: DeprecationWarning: Use get_tensor_dtype instead. dtype = trt.nptype(engine.get_binding_dtype(binding))
> trt_inference_on_a_image.py:209: DeprecationWarning: Use get_tensor_mode instead. if engine.binding_is_input(binding):
> trt_inference_on_a_image.py:220: DeprecationWarning: Use execute_async_v2 instead. context.execute_async(batch_size=1, bindings=bindings, stream_handle=stream.handle)
> [01/11/2024-15:58:51] [TRT] [W] The enqueue() method has been deprecated when used with engines built from a network created with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag. Please use enqueueV2() instead.
> [01/11/2024-15:58:51] [TRT] [W] Also, the batchSize argument passed into this function has no effect on changing the input shapes. Please use setBindingDimensions() function to change input shapes instead.
> trt_inference_on_a_image.py:78: RuntimeWarning: overflow encountered in exp
>   return 1/(1 + np.exp(-x))
> Traceback (most recent call last):
>   File "trt_inference_on_a_image.py", line 274, in <module>
>     boxes_filt, pred_phrases = outputs_postprocess(tokenizer, output_data, box_threshold, text_threshold, with_logits=True, token_spans=None)
>   File "trt_inference_on_a_image.py", line 143, in outputs_postprocess
>     pred_phrase = get_phrases_from_posmap(logit > text_threshold, tokenized, tokenlizer)
>   File "/home/liufurui/TensorRT-8.6.1.6/targets/x86_64-linux-gnu/bin/GroundingDINO/groundingdino/util/utils.py", line 607, in get_phrases_from_posmap
>     token_ids = [tokenized["input_ids"][i] for i in non_zero_idx]
>   File "/home/liufurui/TensorRT-8.6.1.6/targets/x86_64-linux-gnu/bin/GroundingDINO/groundingdino/util/utils.py", line 607, in <listcomp>
>     token_ids = [tokenized["input_ids"][i] for i in non_zero_idx]
> IndexError: list index out of range

> groundingdino has a text branch, but if the tokenizer cannot be exported to onnx, how do we run tensorrt inference afterwards? Or do we run only the transformer part with tensorrt, split the text part out, and write its logic separately to feed the transformer?
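The IndexError in that traceback fires when the predicted pos-map has non-zero positions beyond the length of the tokenized caption — typically because the input_ids fed at inference do not match the caption the model was exported with. A pure-Python sketch of a bounds-checked variant of get_phrases_from_posmap (assumptions: `posmap` is a list of booleans over token positions, and `decode` is any callable mapping token ids to a phrase, e.g. tokenizer.decode; both names here are illustrative, not the library's exact API):

```python
def get_phrases_from_posmap_safe(posmap, input_ids, decode):
    """Bounds-checked sketch: collect token ids the posmap selects,
    dropping positions past the end of the tokenized caption instead of
    raising 'IndexError: list index out of range'."""
    non_zero_idx = [i for i, flag in enumerate(posmap) if flag]
    non_zero_idx = [i for i in non_zero_idx if i < len(input_ids)]  # the guard
    token_ids = [input_ids[i] for i in non_zero_idx]
    return decode(token_ids)

ids = [101, 3778, 7513, 102]                      # hypothetical token ids
posmap = [False, True, True, False, True, True]   # longer than ids → would crash
print(get_phrases_from_posmap_safe(posmap, ids, lambda t: t))
# → [3778, 7513]
```

The guard hides the symptom, not the cause: if it triggers, the real fix is to re-export or re-tokenize so the caption lengths agree, as the replies below also point out.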
> Bro, you should write a blog post about this — I'd even tip you for it.

> The bert model can go into the graph, but the tokenizer cannot; for convenience I pulled the bert model out as well.

> On my side, with the tokenizer split out of G-DINO, FP32 accuracy matches at dynamic input sizes; Swin-T runs in about 170 ms on an A100, roughly 30% faster than torch inference. FP16 does not match yet — I'm debugging it with polygraphy and happy to compare notes if there's progress.
> Environment: tensorrt 8.6.1.6, cuda 11.7
> torch→onnx reference: https://github.com/wenyi5608/GroundingDINO/blob/main/demo/export_openvino.py
> Dynamic inputs (every dynamic dimension of each input and output must be declared; indices start at 0):
> dynamic_axes = {
>     "input_ids": {0: "batch_size", 1: "seq_len"},
>     "attention_mask": {0: "batch_size", 1: "seq_len"},
>     "position_ids": {0: "batch_size", 1: "seq_len"},
>     "token_type_ids": {0: "batch_size", 1: "seq_len"},
>     "text_token_mask": {0: "batch_size", 1: "seq_len", 2: "seq_len"},
>     "img": {0: "batch_size", 2: "height", 3: "width"},
>     "logits": {0: "batch_size"},
>     "boxes": {0: "batch_size"},
> }
> opset_version: 16
> onnx→tensorrt:
> ./trtexec --onnx=/root/GroundingDINO/grounded.onnx --saveEngine=grounded.trt --minShapes=img:1x3x800x1200,input_ids:1x1,attention_mask:1x1,position_ids:1x1,token_type_ids:1x1,text_token_mask:1x1x1 --optShapes=img:1x3x800x1200,input_ids:1x6,attention_mask:1x6,position_ids:1x6,token_type_ids:1x6,text_token_mask:1x6x6 --maxShapes=img:1x3x800x1200,input_ids:1x25,attention_mask:1x25,position_ids:1x25,token_type_ids:1x25,text_token_mask:1x25x25
> tensorrt inference: inference_trt.zip

> Yes — split the text part out and write its logic separately to feed the transformer.

> Your input_ids length is probably misaligned; let me see your onnx export code.

> Package up the code and I'll run it for you when I have time.

Sent via QQ Mail large attachment: weights.pth (1.98G, available until 2024-02-10 17:47). Download page: https://mail.qq.com/cgi-bin/ftnExs_download?k=213961378d4f4db31b37e6251f62001d194d50030a0651064c5b04005f4f570402584c005b06041f050b54050c5b5703040b5752396932450450065f4d111c421551610a&t=exs_ftn_download&code=a9a79b22
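With the tokenizer outside the graph, as the quoted replies recommend, the host code has to hand the exported model its five text-side inputs explicitly. A pure-Python sketch of that preprocessing (assumptions: batch of 1, nested lists standing in for tensors; the full-attention `text_token_mask` here is a simplification — GroundingDINO's real mask restricts text self-attention to tokens of the same phrase):

```python
def build_text_inputs(token_ids):
    """Build the five text-side inputs named in the dynamic_axes above
    from an already-tokenized caption (list of token ids)."""
    n = len(token_ids)
    return {
        "input_ids": [list(token_ids)],
        "attention_mask": [[1] * n],            # no padding in this sketch
        "position_ids": [list(range(n))],
        "token_type_ids": [[0] * n],            # single-segment caption
        # simplification: full seq_len x seq_len attention
        "text_token_mask": [[[True] * n for _ in range(n)]],
    }

inputs = build_text_inputs([101, 3778, 7513, 1012, 102])  # hypothetical ids
print(len(inputs["text_token_mask"][0]))
# → 5
```

In a real pipeline these lists would be numpy arrays bound to the engine's input bindings, with seq_len kept inside the min/max range the engine was built with (1 to 25 in the trtexec command above).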
Bro, did you manage to convert the mmdet-finetuned grounding dino model to onnx? I've got onnx conversion and tensorrt inference working for the official grounding dino model, but not yet for the mmdet-trained one.
|
I used the tensorrt python library, but with the official model, not a fine-tuned one. How do you convert an mmdet-trained groundingdino model to onnx? |
Which tensorrt version are you on? |
Could you share your exported onnx model? The one I convert myself seems to diverge a lot from native torch in accuracy, even though I also followed https://github.com/wenyi5608/GroundingDINO.git. |
It's too big to upload here. |
Could you share a google drive link, or mail it to my inbox: 469915440@qq.com? Thanks a lot. |
https://drive.google.com/file/d/1ax6tjareHAXILphOlrDa6f2nWhRv_GvB/view?usp=drive_link — note that I split the bert model out of this one; the tokenizer preprocessing is done separately. |
ok

Sent via QQ Mail large attachment: grounded_v6.onnx (656.83M, available until 2024-04-18 11:44). Download page: https://mail.qq.com/cgi-bin/ftnExs_download?k=7d65326251f393e5426bb5701130531c401156070355545615565001521d50555c521f060056521e5c00015b5107040b5a570501372061544a0a470c5355056c4e531c0d595e193305&t=exs_ftn_download&code=8e2b70a3
|
Thanks! Is this based on Swin-B or Swin-T? |
Swin-T |
One more question: does anything need to be changed inside mmdetection? |
At FP32, the Swin-B model exported to onnx matches the pytorch output at inference; but after converting the onnx to trt, the inference results contain many abnormal values. What would I need to change? |
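When FP32 outputs align but the engine emits abnormal values, a useful first step (before reaching for polygraphy, as an earlier reply did) is a plain numeric comparison of the two flattened outputs. A small pure-Python sketch (assumptions: both outputs flattened to lists of floats; the tolerances are illustrative — 1e-3 is a common FP32 bound and FP16 usually needs looser, e.g. 1e-2):

```python
def max_abs_diff(a, b):
    """Elementwise max |a - b| over two flat lists of floats."""
    assert len(a) == len(b), "output shapes must match before comparing"
    return max((abs(x - y) for x, y in zip(a, b)), default=0.0)

def aligned(reference_out, engine_out, atol=1e-3):
    """True if every element of the engine output agrees with the
    reference (torch or onnxruntime) output within atol."""
    return max_abs_diff(reference_out, engine_out) <= atol

print(aligned([0.12, 0.90], [0.12, 0.90005]))  # small FP32 noise → True
print(aligned([0.12, 0.90], [0.12, 7.3e12]))   # an abnormal value → False
```

If the final outputs diverge, bisecting by marking intermediate tensors as extra ONNX outputs and comparing them the same way narrows down which layer the TRT build breaks.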
Any progress on this? |
After fine-tuning on my own local dataset with mmdetection, GroundingDINO takes 200-300 ms per image for local inference. Exported to onnx, inference takes about 5 s per image on CPU and about 2 s on GPU. And when converting the onnx model to a tensorrt model, the official trtexec tool fails to build an --fp16 trt model. Does mmdetection plan to support accelerating GroundingDINO?