Grounding DINO fine-tuned with mmdetection cannot be TensorRT-accelerated #11342
Comments
This needs mmdeploy support. Their current rough plan is January, but it's not certain they can finish within January. It would be great if someone from the community is willing to take it on; you can open a PR against mmdet. |
I have a working example of converting GroundingDINO to TensorRT, with 40 ms inference on an RTX 3090. However, the ONNX exported from mmdetection's GroundingDINO currently cannot be converted to an engine, while the official GroundingDINO model can. Could we join as co-developers on mmdeploy? Our team is working on TensorRT acceleration for GroundingDINO. |
If you are interested in supporting this, that would be great. You lead the effort, and coordinate with the mmdeploy folks when you run into problems. |
Could I borrow your Python script for converting the GroundingDINO ONNX to TRT? I keep hitting errors on my side. |
I converted it with trtexec, not with a script.
|
When converting PyTorch to ONNX, did you export the entire Grounding DINO? |
Yes.
|
My converted model still has some issues; I'm not sure whether the ONNX export went wrong. I tested against https://github.com/wenyi5608/GroundingDINO.git. For ONNX to TRT I used:
trtexec --onnx=./weights/groundingdino_swint_ogc.onnx --saveEngine=./weights/groundingdino_swint_ogc.trt --best --workspace=1024
Could you share your ONNX export script? |
What is your environment? I'm using TensorRT 8.6.1.6 and CUDA 11.7.
|
I'm also on 8.6.1.6, with CUDA 12.3. What command did you use for the TRT conversion? |
|
I did get the engine built, but when writing the Python inference script, deserializing the engine hit a bug. I'm still working out how to run inference on the engine file. |
|
The engine I converted directly doesn't match the ONNX outputs. Does yours? |
Do you have a Python-side inference script? Are you using dynamic or static inputs? My inference code isn't finished yet, so I can't tell. |
@QzYER |
> export_trt.zip
> This is my inference script; it includes both ONNX and TRT paths. I use static inputs (couldn't get dynamic working), and currently the fp32 TRT output doesn't match ONNX.

Let me take a look first.
|
Did you test whether fp16 aligns? I just tested on my side and can't align precision either. When you test with different input texts, do the inputs such as input_ids, attention_mask and position_ids correspond correctly? |
|
They all correspond. Look at my script: the ONNX and TRT inputs are identical, and neither fp32 nor fp16 aligns. |
I looked at your inference code; mine is the same, and the outputs don't align either.
Here is my output.
|
If PyTorch or ONNX can't be aligned with TensorRT, the only way is to compare the output of a given PyTorch layer against the output tensor of the corresponding ONNX node, to first localize which part loses precision, and then find a replacement for the offending operator. I've already reported this to the TensorRT team; let's see what they say.
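The layer-by-layer localization described here boils down to comparing the same intermediate tensor from two backends. A minimal sketch of such a comparison helper (the function name and tolerances are my own, not from any script in this thread):

```python
import numpy as np

def report_mismatch(name, ref_out, test_out, atol=1e-3, rtol=1e-3):
    """Compare one layer/node's output from two backends (e.g. PyTorch
    hook vs. ONNX Runtime node output) and report the worst absolute
    and relative error, to localize where precision diverges."""
    a = np.asarray(ref_out, dtype=np.float64)
    b = np.asarray(test_out, dtype=np.float64)
    abs_err = np.abs(a - b)
    rel_err = abs_err / (np.abs(a) + 1e-12)
    ok = bool(np.all(abs_err <= atol + rtol * np.abs(a)))
    print(f"{name}: max_abs={abs_err.max():.3e} "
          f"max_rel={rel_err.max():.3e} aligned={ok}")
    return ok
```

Running this over each dumped layer, outermost first, narrows the mismatch down to a single operator before anything is replaced.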
|
Exporting with mmdeploy requires a fixed image size; the text can stay dynamic. With that, FP32 Grounding DINO precision aligns. |
This is my conversion process. How did you manage to fix the input image size, keep the text dynamic, and still get groundingdino exported?

```
(yolo8) bowen@bowen-MS-7D20:/media/bowen/6202c499-4f0a-4280-af7e-d2ab4b6c74dd/home/bowen/mmdeploy$ python /media/bowen/6202c499-4f0a-4280-af7e-d2ab4b6c74dd/home/bowen/mmdeploy/tools/deploy.py /media/bowen/6202c499-4f0a-4280-af7e-d2ab4b6c74dd/home/bowen/mmdeploy/configs/mmdet/detection/detection_tensorrt-fp16_dynamic-320x320-1344x1344.py /media/bowen/6202c499-4f0a-4280-af7e-d2ab4b6c74dd/home/bowen/mmdeploy/mmdetection/configs/grounding_dino/grounding_dino_swin-b_finetune_16xb2_1x_coco.py /media/bowen/6202c499-4f0a-4280-af7e-d2ab4b6c74dd/home/bowen/grounding_dino_deploy/weights.pth /media/bowen/6202c499-4f0a-4280-af7e-d2ab4b6c74dd/home/bowen/mmdeploy/mmdetection/demo/demo.jpg --work-dir mmdeploy_model/groundingdino --device cuda --dump-info

size mismatch for backbone.patch_embed.projection.weight: copying a param with shape torch.Size([96, 3, 4, 4]) from checkpoint, the shape in current model is torch.Size([128, 3, 4, 4]).
01/09 03:06:13 - mmengine - WARNING - DeprecationWarning: get_onnx_config will be deprecated in the future. Process Process-2:
``` |
I used mmdeploy; static vs. dynamic is just configured in the deploy config file, something like this:
codebase_config = dict( backend_config = dict( |
But converting GD requires rewriting some functions in the mmdeploy library |
Both BaseBackendModel and torch2onnx need slight modifications |
The tokenizer cannot be converted to ONNX; it has to be split out |
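With the tokenizer split out, the remaining text-side tensors can be rebuilt from its ids on the CPU before binding them to the engine. A minimal sketch (the helper name is my own; it assumes a single caption with full self-attention, whereas the real model may use a block-diagonal per-phrase mask):

```python
import numpy as np

def build_text_inputs(input_ids):
    """Given tokenizer output ids for one caption, build the remaining
    text-side tensors the exported graph expects (names follow the
    export inputs discussed in this thread)."""
    ids = np.asarray(input_ids, dtype=np.int64)[None, :]   # (1, seq_len)
    seq_len = ids.shape[1]
    attention_mask = np.ones((1, seq_len), dtype=np.int64)
    position_ids = np.arange(seq_len, dtype=np.int64)[None, :]
    token_type_ids = np.zeros((1, seq_len), dtype=np.int64)
    # Assumption: full attention over the single caption; the real model
    # may restrict attention to tokens within the same phrase.
    text_token_mask = np.ones((1, seq_len, seq_len), dtype=bool)
    return {
        "input_ids": ids,
        "attention_mask": attention_mask,
        "position_ids": position_ids,
        "token_type_ids": token_type_ids,
        "text_token_mask": text_token_mask,
    }
```

The dict's arrays can then be copied straight into the engine's input bindings.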
@wxz1996 How is the accuracy of your 40 ms TensorRT conversion? My TensorRT SwinB fp32 takes about 110 ms on an A100 and the accuracy matches, but neither fp16 nor int8 does |
> The BERT model can be included, but the tokenizer cannot; for convenience I just pulled the BERT model out as well.

You should write a blog post about this; I'd gladly pay for it.
|
+1 |
On my side, with the tokenizer split out, G-DINO's dynamic-shape fp32 precision aligns. swint inference on an A100 takes around 170 ms, about 30% faster than torch inference, but fp16 precision still doesn't align; I'm debugging it with polygraphy and happy to discuss once there's progress.

Environment: TensorRT 8.6.1.6, CUDA 11.7

torch to ONNX
reference: https://github.com/wenyi5608/GroundingDINO/blob/main/demo/export_openvino.py
Dynamic inputs: every dynamic dimension of each input and output must be declared, with indices starting from 0:

```python
dynamic_axes={
    "input_ids": {0: "batch_size", 1: "seq_len"},
    "attention_mask": {0: "batch_size", 1: "seq_len"},
    "position_ids": {0: "batch_size", 1: "seq_len"},
    "token_type_ids": {0: "batch_size", 1: "seq_len"},
    "text_token_mask": {0: "batch_size", 1: "seq_len", 2: "seq_len"},
    "img": {0: "batch_size", 2: "height", 3: "width"},
    "logits": {0: "batch_size"},
    "boxes": {0: "batch_size"}
}
```

opset_version: 16

ONNX to TensorRT:

```shell
./trtexec --onnx=/root/GroundingDINO/grounded.onnx --saveEngine=grounded.trt --minShapes=img:1x3x800x1200,input_ids:1x1,attention_mask:1x1,position_ids:1x1,token_type_ids:1x1,text_token_mask:1x1x1 --optShapes=img:1x3x800x1200,input_ids:1x6,attention_mask:1x6,position_ids:1x6,token_type_ids:1x6,text_token_mask:1x6x6 --maxShapes=img:1x3x800x1200,input_ids:1x25,attention_mask:1x25,position_ids:1x25,token_type_ids:1x25,text_token_mask:1x25x25
```

TensorRT inference: inference_trt.zip
|
groundingdino has a text branch, but if the tokenizer can't be converted to ONNX, how do you run TensorRT inference afterwards? Or do you run only the transformer part with TensorRT, split the text part out, and write separate logic whose output feeds the transformer?
|
Yes, split the text part out and write separate logic whose output feeds into the transformer |
I'm using your command line to convert to TRT on a 3090 right now; it's still converting. I'll tell you the result shortly. Thanks.
|
The ONNX I used is the one exported by the script you shared, and the onnx2tensorrt command is the one you gave, but inference with the resulting TRT fails. The error log is:
```
python trt_inference_on_a_image.py
[01/11/2024-15:58:50] [TRT] [W] TensorRT was linked against cuDNN 8.9.0 but loaded cuDNN 8.4.1
[01/11/2024-15:58:50] [TRT] [W] TensorRT was linked against cuDNN 8.9.0 but loaded cuDNN 8.4.1
[01/11/2024-15:58:51] [TRT] [W] CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage and speed up TensorRT initialization. See "Lazy Loading" section of CUDA documentation https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#lazy-loading
trt_inference_on_a_image.py:258: DeprecationWarning: Use set_input_shape instead.
  context.set_binding_shape(i, opt)
trt_inference_on_a_image.py:199: DeprecationWarning: Use get_tensor_shape instead.
  size = abs(trt.volume(context.get_binding_shape(i))) * bs
trt_inference_on_a_image.py:200: DeprecationWarning: Use get_tensor_dtype instead.
  dtype = trt.nptype(engine.get_binding_dtype(binding))
trt_inference_on_a_image.py:209: DeprecationWarning: Use get_tensor_mode instead.
  if engine.binding_is_input(binding):
trt_inference_on_a_image.py:220: DeprecationWarning: Use execute_async_v2 instead.
  context.execute_async(batch_size=1, bindings=bindings, stream_handle=stream.handle)
[01/11/2024-15:58:51] [TRT] [W] The enqueue() method has been deprecated when used with engines built from a network created with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag. Please use enqueueV2() instead.
[01/11/2024-15:58:51] [TRT] [W] Also, the batchSize argument passed into this function has no effect on changing the input shapes. Please use setBindingDimensions() function to change input shapes instead.
trt_inference_on_a_image.py:78: RuntimeWarning: overflow encountered in exp
  return 1/(1 + np.exp(-x))
Traceback (most recent call last):
  File "trt_inference_on_a_image.py", line 274, in <module>
    boxes_filt, pred_phrases = outputs_postprocess(tokenizer, output_data, box_threshold, text_threshold, with_logits=True, token_spans=None)
  File "trt_inference_on_a_image.py", line 143, in outputs_postprocess
    pred_phrase = get_phrases_from_posmap(logit > text_threshold, tokenized, tokenlizer)
  File "/home/liufurui/TensorRT-8.6.1.6/targets/x86_64-linux-gnu/bin/GroundingDINO/groundingdino/util/utils.py", line 607, in get_phrases_from_posmap
    token_ids = [tokenized["input_ids"][i] for i in non_zero_idx]
  File "/home/liufurui/TensorRT-8.6.1.6/targets/x86_64-linux-gnu/bin/GroundingDINO/groundingdino/util/utils.py", line 607, in <listcomp>
    token_ids = [tokenized["input_ids"][i] for i in non_zero_idx]
IndexError: list index out of range
```
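As an aside, the `RuntimeWarning: overflow encountered in exp` in this log comes from evaluating `1/(1 + np.exp(-x))` on large-magnitude logits; a numerically stable sigmoid avoids it. A sketch:

```python
import numpy as np

def stable_sigmoid(x):
    """Sigmoid that never exponentiates a positive number, so np.exp
    cannot overflow regardless of the logit magnitude."""
    x = np.asarray(x, dtype=np.float64)
    out = np.empty_like(x)
    pos = x >= 0
    out[pos] = 1.0 / (1.0 + np.exp(-x[pos]))   # exp of a non-positive value
    ex = np.exp(x[~pos])                        # exp of a negative value
    out[~pos] = ex / (1.0 + ex)
    return out
```

This only silences the warning; the IndexError itself is a separate problem with the token positions in the posmap.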
|
It's probably that your input_ids length isn't aligned; let me see your ONNX export code. You didn't change the inference code, right? |
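For reference, if the engine was built for a fixed (or bounded) text length, the runtime input_ids must be padded to that length before binding; a hypothetical helper (pad id 0 assumed, as for BERT's [PAD] token):

```python
import numpy as np

def pad_text_inputs(input_ids, target_len, pad_id=0):
    """Pad tokenizer ids to the length the engine was exported with,
    returning the padded ids plus the matching attention mask.
    Refuses to truncate, since that would silently drop caption tokens."""
    ids = list(input_ids)
    if len(ids) > target_len:
        raise ValueError(f"caption tokenizes to {len(ids)} > {target_len}")
    mask = [1] * len(ids) + [0] * (target_len - len(ids))
    ids = ids + [pad_id] * (target_len - len(ids))
    return (np.asarray([ids], dtype=np.int64),
            np.asarray([mask], dtype=np.int64))
```

Keeping the post-processing's `tokenized["input_ids"]` consistent with what was actually bound is exactly the kind of mismatch that produces the IndexError above.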
My input_ids at ONNX export time is the text marked in red. The input is dynamic-sized, between min and max. In the inference code I changed the image and text to my own. |
The input_ids weren't changed when exporting to ONNX. At inference time it's also "the runing dog ."
|
Package up your code and send it over; I'll run it for you when I have time |
Here is my export script, export_openvino.py. In the script I changed the model-loading line to model.load_state_dict(clean_state_dict(checkpoint)). The PyTorch model is in the attachment; it's a pth fine-tuned on my own dataset with mmdetection, with caption="pressure gauge ." (my dataset's text), and the config script is just GroundingDINO_SwinT_OGC.py.
Oversized attachment sent from QQ Mail:
weights.pth (1.98 GB, expires 2024-02-10 17:47). Download page: https://mail.qq.com/cgi-bin/ftnExs_download?k=213961378d4f4db31b37e6251f62001d194d50030a0651064c5b04005f4f570402584c005b06041f050b54050c5b5703040b5752396932450450065f4d111c421551610a&t=exs_ftn_download&code=a9a79b22
|
The GroundingDINO model takes text and image as simultaneous inputs, with cross-attention computed between them. If the tokenizer is split out and not accelerated, with only the transformer part run in TensorRT and the tokenizer folded into post-processing afterwards, I feel that would affect the results. |
|
How did you convert ONNX to TensorRT? Did you use a tool? On my side, I am able to fine-tune on my own dataset.
…------------------ 原始邮件 ------------------
发件人: "open-mmlab/mmdetection" ***@***.***>;
发送时间: 2024年2月23日(星期五) 上午10:04
***@***.***>;
***@***.***>;"State ***@***.***>;
主题: Re: [open-mmlab/mmdetection] 关于mmdetection微调之后 Grounding DINO 无法tensort加速问题 (Issue #11342)
这是我的导出脚本export_openvino.py,脚本中加载模型那里改为了model.load_state_dict(clean_state_dict(checkpoint))。pytorch模型在附件中,模型是mmdetection微调自己数据集之后的pth,caption="pressure gauge ." (自己数据集的text),config脚本就是GroundingDINO_SwinT_OGC.py。
(Quoting the earlier exchange:)

> When exporting to onnx, input_ids was left unchanged; at inference time the caption was also "the runing dog .".

> The onnx I used is the one produced by the script you shared, and I built the engine with the trtexec command you gave, but inference with the resulting trt fails. The log:
>
> python trt_inference_on_a_image.py
> [01/11/2024-15:58:50] [TRT] [W] TensorRT was linked against cuDNN 8.9.0 but loaded cuDNN 8.4.1
> [01/11/2024-15:58:51] [TRT] [W] CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage and speed up TensorRT initialization. See "Lazy Loading" section of CUDA documentation https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#lazy-loading
> trt_inference_on_a_image.py:258: DeprecationWarning: Use set_input_shape instead. context.set_binding_shape(i, opt)
> trt_inference_on_a_image.py:199: DeprecationWarning: Use get_tensor_shape instead. size = abs(trt.volume(context.get_binding_shape(i))) bs
> trt_inference_on_a_image.py:200: DeprecationWarning: Use get_tensor_dtype instead. dtype = trt.nptype(engine.get_binding_dtype(binding))
> trt_inference_on_a_image.py:209: DeprecationWarning: Use get_tensor_mode instead. if engine.binding_is_input(binding):
> trt_inference_on_a_image.py:220: DeprecationWarning: Use execute_async_v2 instead. context.execute_async(batch_size=1, bindings=bindings, stream_handle=stream.handle)
> [01/11/2024-15:58:51] [TRT] [W] The enqueue() method has been deprecated when used with engines built from a network created with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag. Please use enqueueV2() instead.
> [01/11/2024-15:58:51] [TRT] [W] Also, the batchSize argument passed into this function has no effect on changing the input shapes. Please use setBindingDimensions() function to change input shapes instead.
> trt_inference_on_a_image.py:78: RuntimeWarning: overflow encountered in exp
>   return 1/(1 + np.exp(-x))
> Traceback (most recent call last):
>   File "trt_inference_on_a_image.py", line 274, in <module>
>     boxes_filt, pred_phrases = outputs_postprocess(tokenizer, output_data, box_threshold, text_threshold, with_logits=True, token_spans=None)
>   File "trt_inference_on_a_image.py", line 143, in outputs_postprocess
>     pred_phrase = get_phrases_from_posmap(logit > text_threshold, tokenized, tokenlizer)
>   File "/home/liufurui/TensorRT-8.6.1.6/targets/x86_64-linux-gnu/bin/GroundingDINO/groundingdino/util/utils.py", line 607, in get_phrases_from_posmap
>     token_ids = [tokenized["input_ids"][i] for i in non_zero_idx]
>   File "/home/liufurui/TensorRT-8.6.1.6/targets/x86_64-linux-gnu/bin/GroundingDINO/groundingdino/util/utils.py", line 607, in <listcomp>
>     token_ids = [tokenized["input_ids"][i] for i in non_zero_idx]
> IndexError: list index out of range

> groundingdino has a text branch, but if the tokenizer cannot be exported to onnx, how do we run tensorrt inference afterwards? Or do we run only the transformer part with tensorrt, split the text part out, and write its logic separately to feed the transformer?
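The IndexError in that traceback fires when the predicted pos-map has non-zero positions beyond the length of the tokenized caption — typically because the input_ids fed at inference do not match the caption the model was exported with. A pure-Python sketch of a bounds-checked variant of get_phrases_from_posmap (assumptions: `posmap` is a list of booleans over token positions, and `decode` is any callable mapping token ids to a phrase, e.g. tokenizer.decode; both names here are illustrative, not the library's exact API):

```python
def get_phrases_from_posmap_safe(posmap, input_ids, decode):
    """Bounds-checked sketch: collect token ids the posmap selects,
    dropping positions past the end of the tokenized caption instead of
    raising 'IndexError: list index out of range'."""
    non_zero_idx = [i for i, flag in enumerate(posmap) if flag]
    non_zero_idx = [i for i in non_zero_idx if i < len(input_ids)]  # the guard
    token_ids = [input_ids[i] for i in non_zero_idx]
    return decode(token_ids)

ids = [101, 3778, 7513, 102]                      # hypothetical token ids
posmap = [False, True, True, False, True, True]   # longer than ids → would crash
print(get_phrases_from_posmap_safe(posmap, ids, lambda t: t))
# → [3778, 7513]
```

The guard hides the symptom, not the cause: if it triggers, the real fix is to re-export or re-tokenize so the caption lengths agree, as the replies below also point out.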
> Bro, you should write a blog post about this — I'd even tip you for it.

> The bert model can go into the graph, but the tokenizer cannot; for convenience I pulled the bert model out as well.

> On my side, with the tokenizer split out of G-DINO, FP32 accuracy matches at dynamic input sizes; Swin-T runs in about 170 ms on an A100, roughly 30% faster than torch inference. FP16 does not match yet — I'm debugging it with polygraphy and happy to compare notes if there's progress.
> Environment: tensorrt 8.6.1.6, cuda 11.7
> torch→onnx reference: https://github.com/wenyi5608/GroundingDINO/blob/main/demo/export_openvino.py
> Dynamic inputs (every dynamic dimension of each input and output must be declared; indices start at 0):
> dynamic_axes = {
>     "input_ids": {0: "batch_size", 1: "seq_len"},
>     "attention_mask": {0: "batch_size", 1: "seq_len"},
>     "position_ids": {0: "batch_size", 1: "seq_len"},
>     "token_type_ids": {0: "batch_size", 1: "seq_len"},
>     "text_token_mask": {0: "batch_size", 1: "seq_len", 2: "seq_len"},
>     "img": {0: "batch_size", 2: "height", 3: "width"},
>     "logits": {0: "batch_size"},
>     "boxes": {0: "batch_size"},
> }
> opset_version: 16
> onnx→tensorrt:
> ./trtexec --onnx=/root/GroundingDINO/grounded.onnx --saveEngine=grounded.trt --minShapes=img:1x3x800x1200,input_ids:1x1,attention_mask:1x1,position_ids:1x1,token_type_ids:1x1,text_token_mask:1x1x1 --optShapes=img:1x3x800x1200,input_ids:1x6,attention_mask:1x6,position_ids:1x6,token_type_ids:1x6,text_token_mask:1x6x6 --maxShapes=img:1x3x800x1200,input_ids:1x25,attention_mask:1x25,position_ids:1x25,token_type_ids:1x25,text_token_mask:1x25x25
> tensorrt inference: inference_trt.zip

> Yes — split the text part out and write its logic separately to feed the transformer.

> Your input_ids length is probably misaligned; let me see your onnx export code.

> Package up the code and I'll run it for you when I have time.

Sent via QQ Mail large attachment: weights.pth (1.98G, available until 2024-02-10 17:47). Download page: https://mail.qq.com/cgi-bin/ftnExs_download?k=213961378d4f4db31b37e6251f62001d194d50030a0651064c5b04005f4f570402584c005b06041f050b54050c5b5703040b5752396932450450065f4d111c421551610a&t=exs_ftn_download&code=a9a79b22
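With the tokenizer outside the graph, as the quoted replies recommend, the host code has to hand the exported model its five text-side inputs explicitly. A pure-Python sketch of that preprocessing (assumptions: batch of 1, nested lists standing in for tensors; the full-attention `text_token_mask` here is a simplification — GroundingDINO's real mask restricts text self-attention to tokens of the same phrase):

```python
def build_text_inputs(token_ids):
    """Build the five text-side inputs named in the dynamic_axes above
    from an already-tokenized caption (list of token ids)."""
    n = len(token_ids)
    return {
        "input_ids": [list(token_ids)],
        "attention_mask": [[1] * n],            # no padding in this sketch
        "position_ids": [list(range(n))],
        "token_type_ids": [[0] * n],            # single-segment caption
        # simplification: full seq_len x seq_len attention
        "text_token_mask": [[[True] * n for _ in range(n)]],
    }

inputs = build_text_inputs([101, 3778, 7513, 1012, 102])  # hypothetical ids
print(len(inputs["text_token_mask"][0]))
# → 5
```

In a real pipeline these lists would be numpy arrays bound to the engine's input bindings, with seq_len kept inside the min/max range the engine was built with (1 to 25 in the trtexec command above).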
Bro, did you manage to convert the mmdet-finetuned grounding dino model to onnx? I've got onnx conversion and tensorrt inference working for the official grounding dino model, but not yet for the mmdet-trained one.
|
I used the tensorrt python library, but with the official model, not a fine-tuned one. How do you convert an mmdet-trained groundingdino model to onnx? |
Which tensorrt version are you on? |
Could you share your exported onnx model? The one I convert myself seems to diverge a lot from native torch in accuracy, even though I also followed https://github.com/wenyi5608/GroundingDINO.git. |
It's too big to upload here. |
Could you share a google drive link, or mail it to my inbox: 469915440@qq.com? Thanks a lot. |
https://drive.google.com/file/d/1ax6tjareHAXILphOlrDa6f2nWhRv_GvB/view?usp=drive_link — note that I split the bert model out of this one; the tokenizer preprocessing is done separately. |
ok

Sent via QQ Mail large attachment: grounded_v6.onnx (656.83M, available until 2024-04-18 11:44). Download page: https://mail.qq.com/cgi-bin/ftnExs_download?k=7d65326251f393e5426bb5701130531c401156070355545615565001521d50555c521f060056521e5c00015b5107040b5a570501372061544a0a470c5355056c4e531c0d595e193305&t=exs_ftn_download&code=8e2b70a3
|
Thanks! Is this based on Swin-B or Swin-T? |
Swin-T |
One more question: does anything need to be changed inside mmdetection? |
At FP32, the Swin-B model exported to onnx matches the pytorch output at inference; but after converting the onnx to trt, the inference results contain many abnormal values. What would I need to change? |
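When FP32 outputs align but the engine emits abnormal values, a useful first step (before reaching for polygraphy, as an earlier reply did) is a plain numeric comparison of the two flattened outputs. A small pure-Python sketch (assumptions: both outputs flattened to lists of floats; the tolerances are illustrative — 1e-3 is a common FP32 bound and FP16 usually needs looser, e.g. 1e-2):

```python
def max_abs_diff(a, b):
    """Elementwise max |a - b| over two flat lists of floats."""
    assert len(a) == len(b), "output shapes must match before comparing"
    return max((abs(x - y) for x, y in zip(a, b)), default=0.0)

def aligned(reference_out, engine_out, atol=1e-3):
    """True if every element of the engine output agrees with the
    reference (torch or onnxruntime) output within atol."""
    return max_abs_diff(reference_out, engine_out) <= atol

print(aligned([0.12, 0.90], [0.12, 0.90005]))  # small FP32 noise → True
print(aligned([0.12, 0.90], [0.12, 7.3e12]))   # an abnormal value → False
```

If the final outputs diverge, bisecting by marking intermediate tensors as extra ONNX outputs and comparing them the same way narrows down which layer the TRT build breaks.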
Any progress on this? |
After fine-tuning on my own local dataset with mmdetection, GroundingDINO takes 200-300 ms per image for local inference. Exported to onnx, inference takes about 5 s per image on CPU and about 2 s on GPU. And when converting the onnx model to a tensorrt model, the official trtexec tool fails to build an --fp16 trt model. Does mmdetection plan to support accelerating GroundingDINO?