
PP-LiteSeg inference time question #3369

Closed · 1 task done
Deardongfl opened this issue Jul 12, 2023 · 7 comments
Labels: question (Further information is requested)

Comments

@Deardongfl

Search before asking

  • I have searched the existing issues and found no related answer.

Please ask your question

I deployed the PP-LiteSeg model following the tutorial. After exporting the model, I ran inference with:

python deploy/python/infer.py \
    --config output/inference_model/ppliteseg_T1/deploy.yaml \
    --image_path /home/dfl/resize_half/leftImg8bit_512/train/aachen/aachen_000001_000019_leftImg8bit.png \
    --save_dir output/result/ppliteseg_T1 \
    --device 'gpu' \
    --use_trt True \
    --enable_auto_tune True \
    --benchmark True \
    --precision 'fp32'

Here the inference time is 7.5 ms, and switching to "int8" only brings it down to 5.17 ms at best.
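When comparing such latency numbers, it helps to know how they were measured: benchmark results are usually averaged over many runs after a warm-up phase (TensorRT in particular builds its engine on the first calls). A minimal timing harness, with a hypothetical `run_inference` callable standing in for the actual predictor, might look like:

```python
import time

def benchmark(fn, warmup=10, runs=100):
    """Return the average latency of fn in milliseconds after warm-up."""
    for _ in range(warmup):          # warm-up: engine build, caches, clocks
        fn()
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    elapsed = time.perf_counter() - start
    return elapsed / runs * 1000.0   # ms per run

# usage with a stand-in workload instead of a real predictor call
latency_ms = benchmark(lambda: sum(range(10_000)))
print(f"avg latency: {latency_ms:.3f} ms")
```

This is only a sketch of the measurement methodology, not PaddleSeg's actual `--benchmark` implementation.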
[screenshot: ppliteseg_T1_infer]

An earlier issue said that converting to ONNX and accelerating with TensorRT is faster than Paddle Inference and can reach the speed reported in the paper. However, after following the tutorial, inference is still not as fast as expected:

python deploy/python/infer_onnx_trt.py \
    --config configs/stdcseg/stdc1_seg_cityscapes_1024x512_160k_if_RegSeg_5.yml \
    --width 1024 --height 512 \
    --enable_profile

[screenshot: ppliteseg_T1_infer_onnx]
TensorRT version: 8.4.1.5; GPU: RTX 2060
[Screenshot 2023-07-12 13:02:46]

I also tried STDC1_Seg50:

python deploy/python/infer.py \
    --config output/inference_model/stdc1_seg50/deploy.yaml \
    --image_path /home/dfl/resize_half/leftImg8bit_512/train/aachen/aachen_000001_000019_leftImg8bit.png \
    --save_dir output/result/stdc1_seg50 \
    --device 'gpu' \
    --use_trt True \
    --enable_auto_tune True \
    --benchmark True \
    --precision 'int8'

With fp32 the inference time is about 6 ms; with "int8" it reaches 4.446 ms at best, which is much faster than ppliteseg_T1. With ONNX plus TensorRT acceleration it is 7.427 ms.

Did the inference in the paper use quantization? How can it reach 273 FPS?
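For context, 273 FPS corresponds to a per-image latency of 1000 / 273 ≈ 3.66 ms, so the measured 7.5 ms is roughly half the paper's throughput. A quick conversion between the two units:

```python
def fps_to_ms(fps):
    """Frames per second -> milliseconds per frame."""
    return 1000.0 / fps

def ms_to_fps(ms):
    """Milliseconds per frame -> frames per second."""
    return 1000.0 / ms

print(f"{fps_to_ms(273):.2f} ms")   # ~3.66 ms per frame at 273 FPS
print(f"{ms_to_fps(7.5):.1f} FPS")  # ~133.3 FPS at 7.5 ms per frame
```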

@Deardongfl Deardongfl added the question Further information is requested label Jul 12, 2023
djl00 commented Jul 12, 2023

Hi, I am also trying to reproduce this code. Could we discuss it?

djl00 commented Jul 12, 2023

QQ: 3406124214

Asthestarsfalll (Contributor) commented Jul 13, 2023

Not reaching the paper's speed may come down to GPU performance; the RTX 2060 is not as fast as the 1080 Ti.
As for why stdc1_seg50 is faster than ppliteseg, that may also be a hardware effect.

Deardongfl (Author)

> It probably did use quantization. Not reaching the paper's speed may come down to GPU performance; the RTX 2060 is not as fast as the 1080 Ti. As for why stdc1_seg50 is faster than ppliteseg, that may also be a hardware effect.

Do you know roughly how their quantization was done? I suspect the extra time is spent in the UAFM module.

Asthestarsfalll (Contributor)

> It probably did use quantization. Not reaching the paper's speed may come down to GPU performance; the RTX 2060 is not as fast as the 1080 Ti. As for why stdc1_seg50 is faster than ppliteseg, that may also be a hardware effect.
>
> Do you know roughly how their quantization was done? I suspect the extra time is spent in the UAFM module.

You can refer to the documentation here; since nothing special is mentioned, I assume the default deployment configuration was used.

ToddBear (Collaborator)

The answers above have fully addressed the question. If you run into new problems, feel free to open a new issue or continue replying under this one.
We have also launched an issue-tackling campaign for the PaddlePaddle suites; interested developers are welcome to join: PaddlePaddle/PaddleOCR#10223

@Daniel-969

Why does this error occur when exporting the model?

File "/home/daniel/anaconda3/envs/paddle_env/lib/python3.8/site-packages/paddle/tensor/creation.py", line 789, in to_tensor
    return _to_tensor_static(data, dtype, stop_gradient)
File "/home/daniel/anaconda3/envs/paddle_env/lib/python3.8/site-packages/paddle/tensor/creation.py", line 674, in _to_tensor_static
    to_stack_list[idx] = _to_tensor_static(
File "/home/daniel/anaconda3/envs/paddle_env/lib/python3.8/site-packages/paddle/tensor/creation.py", line 697, in _to_tensor_static
    output = assign(data)
File "/home/daniel/anaconda3/envs/paddle_env/lib/python3.8/site-packages/paddle/tensor/creation.py", line 2133, in assign
    raise TypeError(

TypeError: The type of received input == object, it is not supported to convert to tensor, such as [[Var], [Var], [3], [4]]

The failing line is:
norm = paddle.cast(paddle.to_tensor([[[[out_w, out_h]]]], 'float32'), y.dtype)
It seems the to_tensor call is what fails, but there is no such problem during training.
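The error message hints at the likely cause: during static-graph export, `out_w`/`out_h` are symbolic variables rather than plain Python numbers, so the nested list handed to `to_tensor` mixes `Var` objects with ints and cannot be converted to a numeric tensor as a whole (in dynamic-graph training they are concrete ints, which is why training works). A NumPy analogue of the failure mode, with a hypothetical `Var` class standing in for Paddle's static-graph variable; this illustrates the type mismatch, not Paddle's actual code path:

```python
import numpy as np

class Var:
    """Stand-in for a symbolic static-graph variable."""
    pass

# Mixing symbolic objects with ints: only an object array results,
# which cannot become a numeric tensor.
mixed = [[Var()], [Var()], [3], [4]]
arr = np.array(mixed, dtype=object)
print(arr.dtype)  # object

# All-concrete numbers convert cleanly to a numeric dtype.
clean = [[1024], [512], [3], [4]]
print(np.array(clean, dtype='float32').dtype)  # float32
```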
