About TensorRT's speedup #19

Closed
beizhengren opened this issue Jul 17, 2020 · 6 comments

Comments

@beizhengren
Contributor

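@enazoe Hello,
Thank you for your generous contribution.
1. Environment
Ubuntu 18.04,
1060TI,
TensorRT 7,
CUDA 10.0,
OpenCV 3.3.1

2. Problem:
I tested sample_detector on my local PC with the default FP32 precision and yolov4:
sample_detector's inference runs at 16 fps.
darknet yolov4's inference also runs at about 15 fps.
The difference seems small. Is something set up incorrectly? Thanks.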

@enazoe
Owner

enazoe commented Jul 17, 2020

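@beizhengren The time reported by sample_detector actually includes pre-processing and post-processing, so the inference time is shorter than the detect time. darknet reports inference time only, excluding pre-processing and post-processing.
Also, darknet's computational cost is inherently close to the cost under FP32; TRT mainly reduces GPU memory usage.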
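
To compare the two numbers fairly, each stage can be timed on its own. The following is a minimal sketch, not code from this repository: `StageTimes`, `time_ms`, `time_stages`, and `fps_from_ms` are hypothetical names, and the three callables stand in for whatever pre-processing, inference, and post-processing the pipeline performs. It assumes the inference callable blocks until the GPU work has finished (for example by synchronizing the stream); otherwise the measured time would be too short.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Per-stage wall-clock time in milliseconds for one frame.
struct StageTimes {
    double pre_ms;
    double infer_ms;
    double post_ms;
    double total_ms() const { return pre_ms + infer_ms + post_ms; }
};

// Run a callable and return the elapsed wall-clock time in milliseconds.
inline double time_ms(const std::function<void()>& fn) {
    assert(static_cast<bool>(fn));  // an empty std::function would throw on call
    const auto start = std::chrono::steady_clock::now();
    fn();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Time the three stages separately. The callables are placeholders
// (assumptions), not functions from the repository.
inline StageTimes time_stages(const std::function<void()>& preprocess,
                              const std::function<void()>& inference,
                              const std::function<void()>& postprocess) {
    StageTimes t{};
    t.pre_ms = time_ms(preprocess);
    t.infer_ms = time_ms(inference);
    t.post_ms = time_ms(postprocess);
    return t;
}

// Convert a per-frame latency in milliseconds to frames per second.
inline double fps_from_ms(double ms) {
    return ms > 0.0 ? 1000.0 / ms : 0.0;
}
```

With this split, `fps_from_ms(t.total_ms())` corresponds to the detect-time figure and `fps_from_ms(t.infer_ms)` to an inference-only figure like the one darknet prints. For reference, 16 fps is a per-frame latency of 62.5 ms.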

@beizhengren
Contributor Author

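@enazoe
Following your suggestion, I timed only doinference, and the frame rate is 20 fps.
And yes: in my test darknet uses about 1.5 GB of GPU memory, while your project uses about 1 GB.

  1. configs/calibration_images.txt appears to contain only image paths. Is it used for INT8?
    If so, what are these images for? (Sorry, I don't know much about INT8.)
  2. Also, in CMakeLists.txt:
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -std=c++14 -Wno-write-strings")

I changed this to c++14 when compiling; otherwise the build fails.
Thanks.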

@enazoe
Owner

enazoe commented Jul 17, 2020

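@beizhengren
1. Yes. The unlabeled images are used for INT8 calibration: because INT8 has fewer bits available to represent values when the engine is serialized, images from the training data are needed for calibration.
2. I'm not sure whether switching to c++14 will cause problems, but you can try it.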
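
Why calibration needs real images can be illustrated with a simplified sketch. It assumes a symmetric max-abs scheme: the largest absolute activation seen on the calibration images sets the scale that maps floats onto the 8-bit range. This is not the method used by this repository or by TensorRT, whose calibrators choose ranges with more elaborate criteria; `calibrate_scale`, `quantize`, and `dequantize` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Derive a symmetric INT8 scale from activation values observed while
// running the calibration images (simplified max-abs rule, an assumption).
inline float calibrate_scale(const std::vector<float>& observed) {
    float max_abs = 0.0f;
    for (float v : observed) {
        max_abs = std::max(max_abs, std::fabs(v));
    }
    // Fall back to 1.0 when nothing (or only zeros) was observed.
    return max_abs > 0.0f ? max_abs / 127.0f : 1.0f;
}

// Map a float to INT8, rounding to nearest and clamping to [-127, 127].
inline std::int8_t quantize(float x, float scale) {
    assert(scale > 0.0f);
    const float q = std::round(x / scale);
    return static_cast<std::int8_t>(std::min(127.0f, std::max(-127.0f, q)));
}

// Map an INT8 value back to an approximate float.
inline float dequantize(std::int8_t q, float scale) {
    return static_cast<float>(q) * scale;
}
```

If the calibration images are not representative of real inputs, the scale is wrong: values outside the observed range are clamped, and a range that is too wide wastes the few levels INT8 has.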

@beizhengren
Contributor Author

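@enazoe
Thanks!
1. So is it enough to pick these images at random from the training set? Is there any requirement on how many?
2. With c++11 the build reports that make_unique() cannot be found. It may just be my local environment.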
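
The missing make_unique() is expected under -std=c++11, since std::make_unique was added in C++14. For a build that has to stay on C++11, a small stand-in is one option. The sketch below is an assumption, not code from this repository: the `compat` namespace is a hypothetical name, and the helper covers single objects only, not arrays.

```cpp
#include <memory>
#include <utility>

namespace compat {

// Minimal C++11 stand-in for std::make_unique (single objects only).
// Forwards the arguments to T's constructor and wraps the result.
template <typename T, typename... Args>
std::unique_ptr<T> make_unique(Args&&... args) {
    return std::unique_ptr<T>(new T(std::forward<Args>(args)...));
}

}  // namespace compat
```

Call sites would then use `compat::make_unique<T>(...)` instead of `std::make_unique<T>(...)`. Switching the flag to -std=c++14, as done above, avoids the need for this helper entirely.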

@enazoe
Owner

enazoe commented Jul 17, 2020

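@beizhengren
There is no requirement; around 100 images should do. I recall seeing that on the official site, but I don't remember clearly. The more images, the longer calibration takes.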
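
Picking roughly 100 images at random from the training set can be sketched as follows. This is an illustration only: `sample_calibration_paths` is a hypothetical helper, the fixed seed is there just to make the selection reproducible, and writing the result to configs/calibration_images.txt (assumed here to hold one path per line) is left to the caller.

```cpp
#include <algorithm>
#include <cstddef>
#include <random>
#include <string>
#include <vector>

// Return up to `count` paths chosen at random, without repetition,
// from the training image list. The input is taken by value so the
// caller's list is left untouched.
inline std::vector<std::string> sample_calibration_paths(
        std::vector<std::string> train_paths,
        std::size_t count,
        unsigned seed) {
    std::mt19937 rng(seed);
    std::shuffle(train_paths.begin(), train_paths.end(), rng);
    if (train_paths.size() > count) {
        train_paths.resize(count);
    }
    return train_paths;
}
```

For example, `sample_calibration_paths(all_paths, 100, 42)` returns 100 distinct paths when the list holds at least 100 entries, and the whole list otherwise.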

@beizhengren
Contributor Author

@enazoe
Great, I understand now. I'll ask again if other questions come up.
Thanks!

@enazoe enazoe closed this as completed Jul 17, 2020